Lafora's disease gene

ABSTRACT

Lafora's disease in humans is characterized by the mutation or deletion of the EPM2A gene, which encodes a protein, Laforin, having a tyrosine phosphatase domain. Many different sequence mutations, including microdeletions, in EPM2A co-segregate with Lafora's disease. Accordingly, detection of mutations in EPM2A is useful in diagnosing Lafora's disease.

This application is a national phase entry application of PCT/CA99/00646 filed Jul. 20, 1999, which claims priority from U.S. provisional application No. 60/130,269 filed Apr. 21, 1999 (now abandoned) and U.S. provisional application No. 60/093,495 filed Jul. 20, 1998 (now abandoned), all of which are incorporated herein by reference in their entirety.

FIELD OF THE INVENTION

The invention relates to a novel gene, EPM2A, that is involved in Lafora's disease; the protein, Laforin, encoded by the gene; and methods of diagnosing and treating Lafora's disease.

BACKGROUND OF THE INVENTION

The epilepsies constitute one of the most common neurological disorders, affecting 40 million people worldwide (1). Within the spectrum of epileptic syndromes is a group of heterogeneous inherited disorders named the Progressive Myoclonus Epilepsies (PME), in which progressive neurological decline and worsening, primarily myoclonic, seizures follow an initial period of normal development (2,3,4). Lafora's disease (LD) is an autosomal recessive and genetically heterogeneous form of Progressive Myoclonus Epilepsy characterized by polyglucosan inclusions, seizures and cumulative neurological deterioration. Onset occurs during late childhood, and the disease usually results in death within a decade of the first symptoms. With few exceptions, patients with LD follow a homogeneous clinical course (4) despite the existence of genetic locus heterogeneity (5). Biopsy (or autopsy) of various tissues including brain, liver, muscle, and skin reveals characteristic periodic acid-Schiff positive polyglucosan inclusions (Lafora bodies) (6-9). Substantial biochemical and histological studies of these bodies suggest LD is a generalized storage disease (8,10,11), but the presumed enzymatic defect remains unknown.

Linkage analysis and homozygosity mapping initially localized a Lafora's disease locus (EPM2A) to a region at chromosome 6q23-q25 bounded by the genetic markers D6S1003 and D6S311 (12,13). However, there is a need in the art to more clearly define the region(s) mutated in Lafora's disease to allow for the development of accurate diagnostic assays for Lafora's disease. More specifically, there is a need to sequence the gene associated with Lafora's disease and to identify mutations and/or deletions in the gene that are causative of Lafora's disease.

SUMMARY OF THE INVENTION

The present inventors have identified a novel gene, EPM2A, that is deleted or mutated in people with Lafora's disease. Using a positional cloning approach, the inventors have identified at chromosome 6q24 the EPM2A gene, which encodes a protein with a consensus amino acid sequence indicative of a tyrosine phosphatase. Accordingly, the present invention provides an isolated nucleic acid molecule containing a sequence encoding an active catalytic site of a protein tyrosine phosphatase which is associated with Lafora's disease.

In one embodiment of the invention, an isolated nucleic acid molecule is provided having a sequence as shown in SEQ.ID.NO.:1 or FIG. 13.

Preferably, the purified and isolated nucleic acid molecule comprises:

(a) a nucleic acid sequence as shown in SEQ.ID.NO.:1 and FIG. 13, wherein T can also be U;

(b) nucleic acid sequences complementary to (a);

(c) nucleic acid sequences which are homologous to (a) or (b);

(d) a fragment of (a) to (c) that is at least 15 bases, preferably 20 to 30 bases, and which will hybridize to (a) to (d) under stringent hybridization conditions; or

(e) a nucleic acid molecule differing from any of the nucleic acids of (a) to (c) in codon sequences due to the degeneracy of the genetic code.

Fourteen different mutations in EPM2A in 24 families have been found that co-segregate with Lafora's disease. These alterations would be predicted to abolish or cause deleterious effects on the protein product, Laforin, resulting in the primary defect in a large portion of patients with the disease. Accordingly, the present invention provides a method of detecting Lafora's disease comprising detecting a mutation or deletion in the EPM2A gene in a sample from a mammal. A mutation can be detected by sequencing the EPM2A gene, in particular in the region in the gene between markers D6S1003 and D6S1042, in a patient and comparing the sequence to the wild type EPM2A sequence shown in FIG. 13 to determine if a mutation or deletion is present. A mutation or deletion can also be detected by assaying for the protein product encoded by EPM2A, Laforin.

Other features and advantages of the present invention will become apparent from the following detailed description. It should be understood, however, that the detailed description and the specific examples, while indicating preferred embodiments of the invention, are given by way of illustration only, since various changes and modifications within the spirit and scope of the invention will become apparent to those skilled in the art from this detailed description.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention will now be described in relation to the drawings in which:

FIG. 1 is a physical map of the Lafora's disease critical region.

FIG. 2A shows a refined mapping of the Lafora disease gene for Lafora family LD39.

FIG. 2B shows a refined mapping of the Lafora disease gene for Lafora family LD-L4.

FIG. 3 shows overlapping cDNA clones aligned with genomic DNA segments.

FIG. 4A is the nucleotide sequence (SEQ ID NO: 3) and predicted amino acid sequence (SEQ ID NO: 4) of EPM2A (incomplete).

FIG. 4B is an amino acid sequence of the carboxy terminus of transcript A (SEQ ID NO: 25) compared to transcript B (SEQ ID NO: 26).

FIG. 4C shows the PTP active sites (SEQ ID NOS: 27-32, respectively) of EPM2A, MTMI (Swiss-Prot C13496), PTEN (Swiss-Prot 000633), PTP 1B (Swiss-Prot), APT P61F (GenBank L14849) and viral PTP (Swiss-Prot Af003534).

FIG. 5 is a Northern blot showing RNA expression pattern of EPM2A.

FIG. 6A shows representative mutations found in Lafora's family LD16.

FIG. 6B shows representative mutations found in Lafora's family LD-33.

FIG. 7 is a nucleotide sequence of transcript A cDNA of the EPM2A gene (SEQ.ID.NO.:3).

FIG. 8 is the predicted amino acid sequence of transcript A (SEQ.ID.NO.:4).

FIG. 9 is a nucleotide sequence of transcript B cDNA of the EPM2A gene (SEQ.ID.NO.:5).

FIG. 10 is the predicted amino acid sequence of transcript B (SEQ.ID.NO.:6).

FIG. 11 is a refined map of the deletion breakpoints in families LD-L4, LD9 and LD1.

FIG. 12A is a restriction map of PCR products with primers H1F/PTPR.

FIG. 12B is the HaeIII and PstI digestion of the H1F/PTPR PCR product.

FIG. 13 is the complete nucleic acid sequence of EPM2A. This is also shown in SEQ.ID.NO.:1.

FIG. 14 is the complete amino acid sequence of EPM2A. This is also shown in SEQ.ID.NO.:2.

DETAILED DESCRIPTION OF THE INVENTION

The present inventors constructed a high resolution physical map across the EPM2A gene to provide additional genetic and physical mapping reagents for refined localization of the disease gene. It was determined that the previously established critical region encompassed approximately 1.2 Mb of DNA. The map allowed the positioning of 7 genetic markers, the metabotropic glutamate receptor 1 (GRM1) gene, and 6 expressed sequence tag (EST) clusters (tentatively named LDCR1-LDCR6) within the interval (FIG. 1). The genetic markers were then used to test for regions of homozygosity in each of the 30 families in which Lafora's disease appeared genotypically to arise from mutations in a gene at 6q23-q25. In a single family (LD39), an extended chain of homozygous markers within the previously established critical region allowed the inventors to tentatively redefine the telomeric boundary at D6S1042 (FIG. 2A). Simultaneously, a homozygous deletion of marker D6S1703 was detected in the affected members of a consanguineous family (LD-L4) (FIG. 2B). This observation confirmed the newly defined critical region as the 600 kb of DNA between D6S1003 and D6S1042 and, more importantly, pinpointed the site of the disease gene within this region.

I. Nucleic Acid Molecules of the Invention

As hereinbefore mentioned, the present invention relates to isolated nucleic acid molecules that are involved in Lafora's disease. The term “isolated” refers to a nucleic acid substantially free of cellular material or culture medium when produced by recombinant DNA techniques, or chemical precursors, or other chemicals when chemically synthesized. The term “nucleic acid” is intended to include DNA and RNA and can be either double stranded or single stranded.

Broadly stated, the present invention provides an isolated nucleic acid molecule containing a sequence encoding an active catalytic site of a protein tyrosine phosphatase which is associated with Lafora's disease. The isolated nucleic acid molecule is preferably the EPM2A gene associated with Lafora's disease. In an embodiment of the invention, the isolated nucleic acid molecule has a sequence as shown in SEQ.ID.NO.:1 and FIG. 13.

Preferably, the purified and isolated nucleic acid molecule comprises:

(a) a nucleic acid sequence as shown in SEQ.ID.NO.:1 and FIG. 13, wherein T can also be U;

(b) nucleic acid sequences complementary to (a);

(c) nucleic acid sequences which are homologous to (a) or (b);

(d) a fragment of (a) to (c) that is at least 15 bases, preferably 20 to 30 bases, and which will hybridize to (a) to (d) under stringent hybridization conditions; or

(e) a nucleic acid molecule differing from any of the nucleic acids of (a) to (c) in codon sequences due to the degeneracy of the genetic code.

The inventors have also isolated alternate forms of EPM2A, which are generally referred to herein as transcript A and transcript B. The nucleic acid sequence of transcript A is shown in SEQ.ID.NO.:3 and FIG. 7. The nucleic acid sequence of transcript B is shown in SEQ.ID.NO.:5 and FIG. 9. The amino acid sequence encoded by transcript A is shown in SEQ.ID.NO.:4 and FIG. 8. The amino acid sequence encoded by transcript B is shown in SEQ.ID.NO.:6 and FIG. 10.

The nucleic acid sequences shown in SEQ.ID.NOS.: 1, 3 and 5 (or FIGS. 13, 7 and 9, respectively) can be collectively referred to herein as “the nucleic acid molecules of the invention”. The amino acid sequences shown in SEQ.ID.NOS.: 2, 4 and 6 (or FIGS. 14, 8 and 10, respectively) may be collectively referred to herein as “the proteins of the invention”.

It will be appreciated that the invention includes nucleic acid molecules encoding truncations of the proteins of the invention, and analogs and homologs of the proteins of the invention and truncations thereof, as described below. It will further be appreciated that variant forms of the nucleic acid molecules of the invention which arise by alternative splicing of an mRNA corresponding to a cDNA of the invention are encompassed by the invention.

Further, it will be appreciated that the invention includes nucleic acid molecules comprising nucleic acid sequences having substantial sequence homology with the nucleic acid sequences of the invention and fragments thereof. The term “sequences having substantial sequence homology” means those nucleic acid sequences which have slight or inconsequential sequence variations from these sequences, i.e. the sequences function in substantially the same manner to produce functionally equivalent proteins. The variations may be attributable to local mutations or structural modifications.

Generally, nucleic acid sequences having substantial homology include nucleic acid sequences having at least 70%, preferably 80-90% identity with the nucleic acid sequences of the invention.
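By way of illustration only, an identity threshold of this kind can be checked computationally. The following Python sketch assumes the two sequences have already been aligned to equal length, with gaps written as "-". The function names and the choice of denominator (all aligned columns that are not gap-versus-gap) are assumptions made for this example and are not specified in this description.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    # Ignore columns where both sequences carry a gap.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A column counts as a match only when both residues are identical non-gaps.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_homology_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 70.0) -> bool:
    """True when identity is at least the threshold (70% by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Other conventions for the denominator (for example, the length of the shorter sequence) would give different percentages, so the convention should be stated whenever an identity figure is reported.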

Another aspect of the invention provides a nucleic acid molecule, and fragments thereof having at least 15 bases, which hybridizes to the nucleic acid molecules of the invention under hybridization conditions, preferably stringent hybridization conditions. Appropriate stringency conditions which promote DNA hybridization are known to those skilled in the art, or may be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6. For example, the following may be employed: 6.0×sodium chloride/sodium citrate (SSC) at about 45° C., followed by a wash of 2.0×SSC at 50° C. The stringency may be selected based on the conditions used in the wash step. For example, the salt concentration in the wash step can be selected from a high stringency of about 0.2×SSC at 50° C. In addition, the temperature in the wash step can be at high stringency conditions, at about 65° C.

Isolated and purified nucleic acid molecules having sequences which differ from the nucleic acid sequence shown in SEQ.ID.NO.:1 or SEQ.ID.NO.:3 or SEQ.ID.NO.:5 due to degeneracy in the genetic code are also within the scope of the invention.
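To illustrate what degeneracy of the genetic code means in practice, the following Python sketch translates two different toy DNA sequences into the same peptide. The codon table is a small subset of the standard genetic code chosen for the example, and the sequences are hypothetical; neither is taken from SEQ.ID.NO.:1, 3 or 5.

```python
# Partial standard genetic code: just enough codons for the toy example.
CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "CTG": "L", "CTC": "L", "TTA": "L",
    "AAA": "K", "AAG": "K",
    "TAA": "*", "TAG": "*", "TGA": "*",
}


def translate(dna: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    dna = dna.upper().replace("U", "T")  # accept RNA input (T can also be U)
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    protein = []
    for i in range(0, len(dna), 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]  # KeyError outside the subset
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Two sequences that differ at several codon positions...
seq_1 = "ATGGCTCTGAAATAA"
seq_2 = "ATGGCGTTAAAGTGA"
# ...yet encode the same peptide, Met-Ala-Leu-Lys.
print(translate(seq_1), translate(seq_2))
```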

Nucleic acid molecules from the EPM2A gene can be isolated by preparing a labelled nucleic acid probe based on all or part of the nucleic acid sequences as shown in SEQ.ID.NO.:1 and FIG. 13, and using this labelled nucleic acid probe to screen an appropriate DNA library (e.g. a cDNA or genomic DNA library). Nucleic acids isolated by screening of a cDNA or genomic DNA library can be sequenced by standard techniques.

Nucleic acid molecules of the invention can also be isolated by selectively amplifying a nucleic acid using the polymerase chain reaction (PCR) methods and cDNA or genomic DNA. It is possible to design synthetic oligonucleotide primers from the nucleic acid molecules as shown in SEQ.ID.NO.:1 and FIG. 13, for use in PCR. A nucleic acid can be amplified from cDNA or genomic DNA using these oligonucleotide primers and standard PCR amplification techniques. The nucleic acid so amplified can be cloned into an appropriate vector and characterized by DNA sequence analysis. It will be appreciated that cDNA may be prepared from mRNA, by isolating total cellular mRNA by a variety of techniques, for example, by using the guanidinium-thiocyanate extraction procedure of Chirgwin et al., Biochemistry, 18, 5294-5299 (1979). cDNA is then synthesized from the mRNA using reverse transcriptase (for example, Moloney MLV reverse transcriptase available from Gibco/BRL, Bethesda, Md., or AMV reverse transcriptase available from Seikagaku America, Inc., St. Petersburg, Fla.).
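When designing primers of this kind, it is common to check basic properties such as length, GC content and an approximate melting temperature. The Python sketch below applies the Wallace rule of thumb, Tm ≈ 2 °C × (A+T) + 4 °C × (G+C), to the H1F and PTPR primers (SEQ ID NOS: 7 and 8) given later in this description. The use of this rule and the function names are assumptions made for illustration; this description does not state how the primers were evaluated.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in the primer."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer: str) -> int:
    """Rough melting temperature in deg C for a short oligonucleotide.

    Wallace rule: 2 * (A + T) + 4 * (G + C). This is only an approximation
    and ignores salt concentration and nearest-neighbour effects.
    """
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


# Primer sequences as listed in this description (SEQ ID NOS: 7 and 8).
H1F = "GAATGCTCTTTCCACTTTGC"
PTPR = "GGCTCCTTAGGGAAATCAG"

for name, seq in (("H1F", H1F), ("PTPR", PTPR)):
    print(name, len(seq), round(gc_fraction(seq), 2), wallace_tm(seq))
```

Under this rule both primers come out at the same estimated melting temperature, which is generally desirable for a primer pair used in a single reaction.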

An isolated nucleic acid molecule of the invention which is RNA can be isolated by cloning a cDNA encoding a novel protein of the invention into an appropriate vector which allows for transcription of the cDNA to produce an RNA molecule which encodes the Laforin protein. For example, a cDNA can be cloned downstream of a bacteriophage promoter (e.g. a T7 promoter) in a vector, the cDNA can be transcribed in vitro with T7 polymerase, and the resultant RNA can be isolated by standard techniques.

A nucleic acid molecule of the invention may also be chemically synthesized using standard techniques. Various methods of chemically synthesizing polydeoxynucleotides are known, including solid-phase synthesis which, like peptide synthesis, has been fully automated in commercially available DNA synthesizers (See e.g., Itakura et al. U.S. Pat. No. 4,598,049; Caruthers et al. U.S. Pat. No. 4,458,066; and Itakura U.S. Pat. Nos. 4,401,796 and 4,373,071).

The initiation codon and untranslated sequences of the nucleic acid molecules of the invention may be determined using currently available computer software designed for the purpose, such as PC/Gene (IntelliGenetics Inc., Calif.). Regulatory elements can be identified using conventional techniques. The function of the elements can be confirmed by using these elements to express a reporter gene which is operatively linked to the elements. These constructs may be introduced into cultured cells using standard procedures. In addition to identifying regulatory elements in DNA, such constructs may also be used to identify proteins interacting with the elements, using techniques known in the art.

The sequence of a nucleic acid molecule of the invention may be inverted relative to its normal presentation for transcription to produce an antisense nucleic acid molecule. Preferably, an antisense sequence is constructed by inverting a region preceding the initiation codon or an unconserved region. In particular, the nucleic acid sequences contained in the nucleic acid molecules of the invention or a fragment thereof, preferably a nucleic acid sequence shown in SEQ.ID.NO.:1, SEQ.ID.NO.:3 or SEQ.ID.NO.:5, may be inverted relative to its normal presentation for transcription to produce antisense nucleic acid molecules.
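As an illustrative sketch, the sequence of an antisense strand can be derived computationally as the reverse complement of the sense strand. The Python function below is a hypothetical helper written for this example, and the input shown is a toy sequence rather than any sequence of the invention.

```python
# Complement map for DNA bases; N (any base) maps to itself.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(sense: str) -> str:
    """Return the reverse complement of a DNA sequence, read 5' to 3'."""
    sense = sense.upper()
    unexpected = set(sense) - set("ACGTN")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return sense.translate(_COMPLEMENT)[::-1]


# Toy sense-strand fragment (hypothetical, for illustration only).
sense_fragment = "ATGCGTAC"
antisense_fragment = reverse_complement(sense_fragment)
print(antisense_fragment)
```

Applying the function twice returns the original sequence, which is a convenient sanity check when preparing antisense constructs in silico.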

The antisense nucleic acid molecules of the invention, or a fragment thereof, may be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed with mRNA or the native gene, e.g. phosphorothioate derivatives and acridine substituted nucleotides. The antisense sequences may be produced biologically using an expression vector introduced into cells in the form of a recombinant plasmid, phagemid or attenuated virus in which antisense sequences are produced under the control of a high efficiency regulatory region, the activity of which may be determined by the cell type into which the vector is introduced.

The invention also provides nucleic acids encoding fusion proteins comprising a novel protein of the invention and a selected protein, or a selectable marker protein (see below).

II. Novel Proteins of the Invention

The invention further includes an isolated protein encoded by the nucleic acid molecules of the invention. Within the context of the present invention, a protein of the invention may include various structural forms of the primary protein which retain biological activity.

Broadly stated, the present invention provides an isolated protein containing a tyrosine phosphatase domain and which is associated with Lafora's disease.

In a preferred embodiment of the invention, the protein has the amino acid sequence as shown in SEQ.ID.NO.:2 and FIG. 14. In another embodiment, the protein has the amino acid sequence shown in SEQ.ID.NO.:4 (or FIG. 8) or SEQ.ID.NO.:6 (or FIG. 10).

In addition to full length amino acid sequences, the proteins of the present invention also include truncations of the protein, and analogs and homologs of the protein and truncations thereof as described herein. Truncated proteins may comprise peptides of at least fifteen amino acid residues.

Analogs of the protein having the amino acid sequence shown in SEQ.ID.NO.:2 (FIG. 14) or SEQ.ID.NO.:4 (FIG. 8) or SEQ.ID.NO.:6 (FIG. 10) and/or truncations thereof as described herein, may include, but are not limited to, an amino acid sequence containing one or more amino acid substitutions, insertions, and/or deletions. Amino acid substitutions may be of a conserved or non-conserved nature. Conserved amino acid substitutions involve replacing one or more amino acids of the proteins of the invention with amino acids of similar charge, size, and/or hydrophobicity characteristics. When only conserved substitutions are made, the resulting analog should be functionally equivalent. Non-conserved substitutions involve replacing one or more amino acids of the amino acid sequence with one or more amino acids which possess dissimilar charge, size, and/or hydrophobicity characteristics.

One or more amino acid insertions may be introduced into the amino acid sequences shown in SEQ.ID.NO.:2 (FIG. 14) or SEQ.ID.NO.:4 (FIG. 8) or SEQ.ID.NO.:6 (FIG. 10). Amino acid insertions may consist of single amino acid residues or sequential amino acids ranging from 2 to 15 amino acids in length. For example, amino acid insertions may be used to destroy target sequences so that the protein is no longer active. This procedure may be used in vivo to inhibit the activity of a protein of the invention.

Deletions may consist of the removal of one or more amino acids, or discrete portions, from the amino acid sequence shown in SEQ.ID.NO.:2 (FIG. 14) or SEQ.ID.NO.:4 (FIG. 8) or SEQ.ID.NO.:6 (FIG. 10). The deleted amino acids may or may not be contiguous. The lower limit length of the resulting analog with a deletion mutation is about 10 amino acids, preferably 100 amino acids.

Analogs of a protein of the invention may be prepared by introducing mutations in the nucleotide sequence encoding the protein. Mutations in nucleotide sequences constructed for expression of analogs of a protein of the invention must preserve the reading frame of the coding sequences. Furthermore, the mutations will preferably not create complementary regions that could hybridize to produce secondary mRNA structures, such as loops or hairpins, which could adversely affect translation of the mRNA.
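The reading-frame requirement can be stated arithmetically: an insertion or deletion within a coding sequence preserves the frame only when the net change in length is a multiple of three nucleotides. The following Python sketch expresses that check; the function name and the example values are hypothetical.

```python
def preserves_reading_frame(inserted_nt: int, deleted_nt: int = 0) -> bool:
    """True when the net length change is a multiple of 3 nucleotides."""
    if inserted_nt < 0 or deleted_nt < 0:
        raise ValueError("nucleotide counts must be non-negative")
    return (inserted_nt - deleted_nt) % 3 == 0


# Inserting two codons (6 nt) keeps the frame.
print(preserves_reading_frame(6))
# Deleting a single nucleotide shifts the frame.
print(preserves_reading_frame(0, 1))
```

This check covers only the frame itself; it does not detect a stop codon created at the junction of the altered sequence.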

Mutations may be introduced at particular loci by synthesizing oligonucleotides containing a mutant sequence, flanked by restriction sites enabling ligation to fragments of the native sequence. Following ligation, the resulting reconstructed sequence encodes an analog having the desired amino acid insertion, substitution, or deletion.

Alternatively, oligonucleotide-directed site specific mutagenesis procedures may be employed to provide an altered gene having particular codons altered according to the substitution, deletion, or insertion required. Deletion or truncation of a protein of the invention may also be constructed by utilizing convenient restriction endonuclease sites adjacent to the desired deletion. Subsequent to restriction, overhangs may be filled in, and the DNA religated. Exemplary methods of making the alterations set forth above are disclosed by Sambrook et al. (Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory Press, 1989).

The proteins of the invention also include homologs of the amino acid sequence shown in SEQ.ID.NO.:2 (FIG. 14) or SEQ.ID.NO.:4 (FIG. 8) or SEQ.ID.NO.:6 (FIG. 10) and/or truncations thereof as described herein. Such homologs are proteins whose amino acid sequences are encoded by nucleic acid sequences that hybridize under stringent hybridization conditions (see discussion of stringent hybridization conditions herein) with a probe used to obtain a protein of the invention. Preferably, homologs of a protein of the invention will have a tyrosine phosphatase region which is characteristic of the protein.

A homologous protein includes a protein with an amino acid sequence having at least 70%, preferably 80-90% identity with the amino acid sequence as shown in SEQ.ID.NO.:2 (FIG. 14) or SEQ.ID.NO.:4 (FIG. 8) or SEQ.ID.NO.:6 (FIG. 10).

The invention also contemplates isoforms of the proteins of the invention. An isoform contains the same number and kinds of amino acids as a protein of the invention, but the isoform has a different molecular structure. The isoforms contemplated by the present invention are those having the same properties as a protein of the invention as described herein.

The present invention also includes a protein of the invention conjugated with a selected protein, or a selectable marker protein (see below), to produce fusion proteins. Additionally, immunogenic portions of a protein of the invention are within the scope of the invention.

The proteins of the invention (including truncations, analogs, etc.) may be prepared using recombinant DNA methods. Accordingly, the nucleic acid molecules of the present invention having a sequence which encodes a protein of the invention may be incorporated in a known manner into an appropriate expression vector which ensures good expression of the protein. Possible expression vectors include but are not limited to cosmids, plasmids, or modified viruses (e.g. replication defective retroviruses, adenoviruses and adeno-associated viruses), so long as the vector is compatible with the host cell used. The statement that the expression vectors are “suitable for transformation of a host cell” means that the expression vectors contain a nucleic acid molecule of the invention and regulatory sequences, selected on the basis of the host cells to be used for expression, which are operatively linked to the nucleic acid molecule. Operatively linked is intended to mean that the nucleic acid is linked to regulatory sequences in a manner which allows expression of the nucleic acid.

The invention therefore contemplates a recombinant expression vector of the invention containing a nucleic acid molecule of the invention, or a fragment thereof, and the necessary regulatory sequences for the transcription and translation of the inserted protein-sequence. Suitable regulatory sequences may be derived from a variety of sources, including bacterial, fungal, or viral genes (for example, see the regulatory sequences described in Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990)). Selection of appropriate regulatory sequences is dependent on the host cell chosen, and may be readily accomplished by one of ordinary skill in the art. Examples of such regulatory sequences include: a transcriptional promoter and enhancer or RNA polymerase binding sequence, and a ribosomal binding sequence, including a translation initiation signal. Additionally, depending on the host cell chosen and the vector employed, other sequences, such as an origin of replication, additional DNA restriction sites, enhancers, and sequences conferring inducibility of transcription may be incorporated into the expression vector. It will also be appreciated that the necessary regulatory sequences may be supplied by the native protein and/or its flanking regions.

The invention further provides a recombinant expression vector comprising a DNA nucleic acid molecule of the invention cloned into the expression vector in an antisense orientation. That is, the DNA molecule is operatively linked to a regulatory sequence in a manner which allows for expression, by transcription of the DNA molecule, of an RNA molecule which is antisense to a nucleotide sequence comprising the nucleotides as shown in SEQ.ID.NO.:1, SEQ.ID.NO.:3 or SEQ.ID.NO.:5. Regulatory sequences operatively linked to the antisense nucleic acid can be chosen which direct the continuous expression of the antisense RNA molecule.

The recombinant expression vectors of the invention may also contain a selectable marker gene which facilitates the selection of host cells transformed or transfected with a recombinant molecule of the invention. Examples of selectable marker genes are genes encoding a protein which confers resistance to certain drugs, such as G418 and hygromycin, or genes encoding β-galactosidase, chloramphenicol acetyltransferase, or firefly luciferase. Transcription of the selectable marker gene is monitored by changes in the concentration of the selectable marker protein such as β-galactosidase, chloramphenicol acetyltransferase, or firefly luciferase. If the selectable marker gene encodes a protein conferring antibiotic resistance such as neomycin resistance, transformant cells can be selected with G418. Cells that have incorporated the selectable marker gene will survive, while the other cells die. This makes it possible to visualize and assay for expression of recombinant expression vectors of the invention and in particular to determine the effect of a mutation on expression and phenotype. It will be appreciated that selectable markers can be introduced on a separate vector from the nucleic acid of interest.

The recombinant expression vectors may also contain genes which encode a fusion moiety which provides increased expression of the recombinant protein; increased solubility of the recombinant protein; and aids in the purification of a target recombinant protein by acting as a ligand in affinity purification. For example, a proteolytic cleavage site may be added to the target recombinant protein to allow separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein.

Recombinant expression vectors can be introduced into host cells to produce a transformant host cell. The term “transformant host cell” is intended to include prokaryotic and eukaryotic cells which have been transformed or transfected with a recombinant expression vector of the invention. The terms “transformed with”, “transfected with”, “transformation” and “transfection” are intended to encompass introduction of nucleic acid (e.g. a vector) into a cell by one of many possible techniques known in the art. Prokaryotic cells can be transformed with nucleic acid by, for example, electroporation or calcium-chloride mediated transformation. Nucleic acid can be introduced into mammalian cells via conventional techniques such as calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofectin, electroporation or microinjection. Suitable methods for transforming and transfecting host cells can be found in Sambrook et al. (Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold Spring Harbor Laboratory Press (1989)), and other laboratory textbooks.

Suitable host cells include a wide variety of prokaryotic and eukaryotic host cells. For example, the proteins of the invention may be expressed in bacterial cells such as E. coli, insect cells (using baculovirus), yeast cells or mammalian cells. Other suitable host cells can be found in Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990).

The proteins of the invention may also be prepared by chemical synthesis using techniques well known in the chemistry of proteins such as solid phase synthesis (Merrifield, 1964, J. Am. Chem. Soc. 85:2149-2154) or synthesis in homogeneous solution (Houben-Weyl, 1987, Methods of Organic Chemistry, ed. E. Wünsch, Vol. 15 I and II, Thieme, Stuttgart).

III. Applications

A. Diagnostic Applications

As previously mentioned, the present inventors have isolated and sequenced a novel gene, EPM2A, and have shown that it is deleted or mutated in people with Lafora's disease. As a result, the present invention also includes a method of detecting Lafora's disease by detecting a mutation or deletion in the Lafora's disease gene or protein.

i) Detecting Mutations in the Nucleic Acid Sequence

In one embodiment, the present invention provides a method for detecting Lafora's disease comprising detecting a deletion or mutation in the Lafora's disease gene in a sample obtained from an animal, preferably a mammal, more preferably a human. Preferably, the invention provides a method of detecting Lafora's disease comprising detecting a deletion or mutation in the Lafora's disease gene in the region between markers D6S1003 and D6S1042.

The Examples and Tables 1 to 3 summarize some of the mutations found in EPM2A in patients with Lafora's disease. Screening assays can be developed for each of the mutations. Details of screening assays that may be employed for the 3 common mutations are provided in Example 3.

One of the common EPM2A mutations is a C→T nonsense mutation of the second base pair of exon 4 found at position 721 in FIG. 13. This mutation destroys the recognition site for the restriction enzyme HaeIII. Accordingly, the C to T mutation can be detected in a sample by a method comprising:

(a) amplifying the nucleic acid sequences in the sample with primers H1F (5′-GAATGCTCTTTCCACTTTGC-3′) (SEQ ID NO: 7) and PTPR (5′-GGCTCCTTAGGGAAATCAG-3′) (SEQ ID NO: 8) in a polymerase chain reaction;

(b) digesting the amplified sequences with the restriction endonuclease HaeIII; and

(c) determining the size of the digested sequences, wherein the presence of a fragment of approximately 199 bp indicates the sample is from an animal with Lafora's disease or an animal that is a carrier of Lafora's disease.
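The principle behind steps (a) to (c), namely that a point mutation which destroys a restriction site changes the fragment pattern, can be sketched in silico. The Python example below uses a short hypothetical amplicon, not the actual H1F/PTPR product, and relies on the known HaeIII recognition sequence GGCC, which is cut between GG and CC. The function name is an assumption made for illustration.

```python
def digest(amplicon: str, site: str, cut_offset: int) -> list[int]:
    """Fragment lengths after cutting a linear amplicon at every site.

    cut_offset is the number of bases from the start of the recognition
    site to the cut position (2 for HaeIII: GG^CC).
    """
    amplicon = amplicon.upper()
    cuts = []
    start = amplicon.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = amplicon.find(site, start + 1)
    bounds = [0] + cuts + [len(amplicon)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Hypothetical 24 bp toy amplicons (for illustration only).
wild_type = "A" * 10 + "GGCC" + "T" * 10   # contains one HaeIII site
mutant = "A" * 10 + "GGTC" + "T" * 10      # C->T change destroys the site

print(digest(wild_type, "GGCC", 2))  # two fragments
print(digest(mutant, "GGCC", 2))     # one uncut fragment
```

In a real assay the expected fragment sizes would be computed from the actual amplicon sequence, including any additional HaeIII sites it contains.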

Another common mutation in EMP2A is a G→A mutation of base pair 115 inexon 4 (position 836 in FIG. 13). This mutation creates a new PstIrestriction site in the 520 bp DNA fragment that is amplified by primersH1F and PTPR, which is not found in normal, non-carrier individuals.Consequently, the present invention provides a method for detecting a Gto A mutation in EMP2A by a method comprising:

(a) amplifying the nucleic acid sequences in the sample with primers H1F (5′-GAATGCTCTTTCCACTTTGC-3′) (SEQ ID NO: 7) and PTPR (5′-GGCTCCTTAGGGAAATCAG-3′) (SEQ ID NO: 8) in a polymerase chain reaction;

(b) digesting the amplified sequences with the restriction endonuclease PstI; and

(c) determining the size of the digested sequences wherein the presence of at least one fragment of approximately 520 bp indicates that the sample is from an animal that does not have Lafora's disease or an animal that is a carrier of Lafora's disease. Persons with Lafora's disease will have two variant bands of 195 base pairs and 350 base pairs.
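
The band interpretation in step (c) can be sketched as a small decision function. The band sizes (520 bp uncut, 195 bp and 350 bp variant bands) are taken from the text above; the function name, the size tolerance, and the rule that a carrier shows both the uncut band and the variant bands are assumptions made for illustration only.

```python
from typing import Iterable, Tuple


def call_pst1_genotype(
    bands: Iterable[float],
    uncut: int = 520,                    # from the text: undigested amplicon
    cut: Tuple[int, int] = (195, 350),   # from the text: variant bands
    tol: int = 15,                       # hypothetical gel sizing tolerance (bp)
) -> str:
    """Classify a PstI digest from observed band sizes (in bp)."""
    observed = list(bands)

    def seen(size: int) -> bool:
        return any(abs(b - size) <= tol for b in observed)

    has_uncut = seen(uncut)
    has_cut = all(seen(c) for c in cut)
    if has_uncut and has_cut:
        return "carrier"        # assumption: one normal and one mutant allele
    if has_uncut:
        return "non-carrier"
    if has_cut:
        return "affected"
    return "inconclusive"       # e.g. failed amplification or digestion


print(call_pst1_genotype([520]))
print(call_pst1_genotype([520, 350, 195]))
print(call_pst1_genotype([350, 195]))
```

An "inconclusive" result is returned rather than a guess when none of the expected bands is observed, since a failed reaction cannot be distinguished from a genotype on band sizes alone.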

Many families with Lafora's disease have deletions of EPM2A. Patients homozygous for these deletions can be detected by the absence of PCR amplification products using primers JRGXBF/JRGXBR, which amplify the deleted region. Consequently, the present invention includes a method for determining a deletion in the EPM2A gene by a method comprising:

(a) amplifying the nucleic acid sequences in the sample with primers JRGXBF (5′-TCCATTGTGCTAATGCTATCTC-3′) (SEQ ID NO: 9) and JRGXBR (5′-TCAGCTTGCTTTGAGGATATTT-3′) (SEQ ID NO: 10) in a polymerase chain reaction; and

(b) detecting amplified sequence wherein the absence of an amplified sequence indicates that the sample is from an animal with Lafora's disease.
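
The principle behind this deletion assay, namely that no product forms when the primer binding region is absent, can be shown with a minimal in-silico PCR sketch. The primer sequences are those given in step (a); the function names and both toy templates are hypothetical and stand in for genomic DNA, so the product length printed here is not the real amplicon size.

```python
from typing import Optional

JRGXBF = "TCCATTGTGCTAATGCTATCTC"  # SEQ ID NO: 9, from the text
JRGXBR = "TCAGCTTGCTTTGAGGATATTT"  # SEQ ID NO: 10, from the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def amplicon_length(template: str, fwd: str, rev: str) -> Optional[int]:
    """Length of the product expected on the given strand, or None.

    Requires an exact match of the forward primer and, downstream of it,
    an exact match of the reverse complement of the reverse primer.
    """
    start = template.find(fwd)
    if start == -1:
        return None
    rev_site = reverse_complement(rev)
    end = template.find(rev_site, start + len(fwd))
    if end == -1:
        return None
    return end + len(rev_site) - start


# Hypothetical templates: one containing both primer sites, one lacking them.
toy_intact = "AAAA" + JRGXBF + "ACGTACGTAC" + reverse_complement(JRGXBR) + "TTTT"
toy_deleted = "ACGTACGT" * 5

print(amplicon_length(toy_intact, JRGXBF, JRGXBR))   # a product is expected
print(amplicon_length(toy_deleted, JRGXBF, JRGXBR))  # None: no product
```

Because the read-out is the absence of a band, a laboratory assay would normally be run alongside some amplification control; that control is not described in the text above and is mentioned here only as a general consideration.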

One skilled in the art will appreciate that other methods, in addition to the ones discussed above and in the examples, can be used to detect mutations in the EPM2A gene. For example, in order to isolate nucleic acids from the Lafora's disease gene in a sample, one can prepare nucleotide probes from the nucleic acid sequences of the invention. In addition, the nucleic acid probes described herein (for example, see FIG. 1) can also be used. A nucleotide probe may be labelled with a detectable marker such as a radioactive label which provides for an adequate signal and has sufficient half life such as ³²P, ³H, ¹⁴C or the like. Other detectable markers which may be used include antigens that are recognized by a specific labelled antibody, fluorescent compounds, enzymes, antibodies specific for a labelled antigen, and chemiluminescent compounds. An appropriate label may be selected having regard to the rate of hybridization and binding of the probe to the nucleotide to be detected and the amount of nucleotide available for hybridization.

Accordingly, the present invention also relates to a method of detecting the presence of a nucleic acid molecule from the EPM2A gene in a sample comprising contacting the sample under hybridization conditions with one or more nucleotide probes which hybridize to the nucleic acid molecules and are labelled with a detectable marker, and determining the degree of hybridization between the nucleic acid molecule in the sample and the nucleotide probes. Preferably, the nucleic acid probes hybridize with a portion of the EPM2A gene containing a mutation site in Lafora's disease, for example, in the region between markers D6S1003 and D6S1042.

Hybridization conditions which may be used in the methods of the invention are known in the art and are described for example in Sambrook J, Fritsch E F, Maniatis T. In: Molecular Cloning, A Laboratory Manual, 1989. (Nolan C, Ed.), Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. The hybridization product may be assayed using techniques known in the art. The nucleotide probe may be labelled with a detectable marker as described herein and the hybridization product may be assayed by detecting the detectable marker or the detectable change produced by the detectable marker.

Prior to hybridizing a sample with DNA probes, the sample can be treated with primers that flank the EPM2A gene in order to amplify the nucleic acid sequences in the sample. The primers used may be the ones described in the present application. For example, primers specific for the transcript A include 266F and GSP3. Primers for the transcript B include AA490925F and AA490925R. In addition, the sequence of the EPM2A gene provided herein also permits the identification and isolation, or synthesis, of new nucleotide sequences which may be used as primers to amplify a nucleic acid molecule of the invention, for example in the polymerase chain reaction (PCR) which is discussed in more detail below. The primers may be used to amplify the genomic DNA of other species. The PCR amplified sequences can be examined to determine the relationship between the genes of various species.

The length and bases of the primers for use in the PCR are selected so that they will hybridize to different strands of the desired sequence and at relative positions along the sequence such that an extension product synthesized from one primer, when it is separated from its template, can serve as a template for extension of the other primer into a nucleic acid of defined length. Primers which may be used in the invention are oligonucleotides, i.e. molecules containing two or more deoxyribonucleotides of the nucleic acid molecule of the invention, which occur naturally as in a purified restriction endonuclease digest or are produced synthetically using techniques known in the art such as for example phosphotriester and phosphodiester methods (See Good et al Nucl. Acid Res 4:2157, 1977) or automated techniques (See for example, Connolly, B. A. Nucleic Acids Res. 15(7): 3131, 1987). The primers are capable of acting as a point of initiation of synthesis when placed under conditions which permit the synthesis of a primer extension product which is complementary to the DNA sequence of the invention, i.e. in the presence of nucleotide substrates, an agent for polymerization such as DNA polymerase and at suitable temperature and pH. Preferably, the primers are sequences that do not form secondary structures by base pairing with other copies of the primer or sequences that form a hairpin configuration. The primer preferably contains between about 7 and 25 nucleotides.
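
The two stated preferences, a length of about 7 to 25 nucleotides and the avoidance of hairpin-forming sequences, can be checked with a rough screening sketch. The function names, the 4-base stem and the 3-base minimum loop are hypothetical simplifications chosen for illustration; they are not criteria given in this specification, and real primer design tools use thermodynamic models instead.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def primer_length_ok(primer: str, min_len: int = 7, max_len: int = 25) -> bool:
    """True if the primer length falls in the preferred range from the text."""
    return min_len <= len(primer) <= max_len


def has_hairpin(primer: str, stem: int = 4, min_loop: int = 3) -> bool:
    """Crude hairpin screen (hypothetical thresholds).

    Reports True if any stretch of `stem` bases has its exact reverse
    complement further along the primer, separated by at least `min_loop`
    bases, so that the two stretches could pair and fold back.
    """
    n = len(primer)
    for i in range(n - 2 * stem - min_loop + 1):
        arm = primer[i:i + stem]
        if primer.find(reverse_complement(arm), i + stem + min_loop) != -1:
            return True
    return False


print(primer_length_ok("GAATGCTCTTTCCACTTTGC"))  # H1F as given for SEQ ID NO: 7
print(has_hairpin("GGGGAAAACCCC"))               # toy sequence with GGGG/CCCC arms
print(has_hairpin("AAAAAAAAAAAA"))               # toy sequence with no complement
```

A screen of this kind only flags obvious self-complementarity; pairing between two copies of a primer, which the paragraph above also mentions, would need a separate comparison of the primer against its own reverse complement.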

The primers may be labelled with detectable markers which allow for detection of the amplified products. Suitable detectable markers are radioactive markers such as P-32, S-35, I-125, and H-3, luminescent markers such as chemiluminescent markers, preferably luminol, and fluorescent markers, preferably dansyl chloride, fluorescein-5-isothiocyanate, and 4-fluoro-7-nitrobenz-2-oxa-1,3-diazole, enzyme markers such as horseradish peroxidase, alkaline phosphatase, β-galactosidase, acetylcholinesterase, or biotin.

It will be appreciated that the primers may contain non-complementary sequences provided that a sufficient amount of the primer contains a sequence which is complementary to a nucleic acid molecule of the invention or oligonucleotide fragment thereof, which is to be amplified. Restriction site linkers may also be incorporated into the primers, allowing for digestion of the amplified products with the appropriate restriction enzymes, facilitating cloning and sequencing of the amplified product.

In an embodiment of the invention, a method of determining the presence of a nucleic acid molecule of the invention is provided comprising treating the sample with primers which are capable of amplifying the nucleic acid molecule or a predetermined oligonucleotide fragment thereof in a polymerase chain reaction to form amplified sequences, under conditions which permit the formation of amplified sequences, and assaying for amplified sequences.

The polymerase chain reaction refers to a process for amplifying a target nucleic acid sequence as generally described in Innis et al, Academic Press, 1990, in Mullis et al., U.S. Pat. No. 4,683,195 and Mullis, U.S. Pat. No. 4,683,202, which are incorporated herein by reference. Conditions for amplifying a nucleic acid template are described in M. A. Innis and D. H. Gelfand, PCR Protocols, A Guide to Methods and Applications, M. A. Innis, D. H. Gelfand, J. J. Sninsky and T. J. White eds, pp 3-12, Academic Press 1989, which is also incorporated herein by reference.

The amplified products can be isolated and distinguished based on their respective sizes using techniques known in the art. For example, after amplification, the DNA sample can be separated on an agarose gel and visualized, after staining with ethidium bromide, under ultraviolet (UV) light. DNA may be amplified to a desired level and a further extension reaction may be performed to incorporate nucleotide derivatives having detectable markers such as radioactive labelled or biotin labelled nucleoside triphosphates. The primers may also be labelled with detectable markers as discussed above. The detectable markers may be analyzed by restriction and electrophoretic separation or other techniques known in the art.

The conditions which may be employed in the methods of the invention using PCR are those which permit hybridization and amplification reactions to proceed in the presence of DNA in a sample and appropriate complementary hybridization primers. Conditions suitable for the polymerase chain reaction are generally known in the art. For example, see M. A. Innis and D. H. Gelfand, PCR Protocols, A Guide to Methods and Applications, M. A. Innis, D. H. Gelfand, J. J. Sninsky and T. J. White eds, pp 3-12, Academic Press 1989, which is incorporated herein by reference. Preferably, the PCR utilizes polymerase obtained from the thermophilic bacterium Thermus aquaticus (Taq polymerase, GeneAmp Kit, Perkin Elmer Cetus), or another thermostable polymerase may be used to amplify DNA template strands.

It will be appreciated that other techniques such as the Ligase Chain Reaction (LCR) and NASBA may be used to amplify a nucleic acid molecule of the invention (Barany in “PCR Methods and Applications”, August 1991, Vol. 1(1), page 5, and European Published Application No. 0320308, published Jun. 14, 1989, and U.S. Ser. No. 5,130,238 to Malek).

(ii) Detecting the Laforin Protein

In another embodiment, the present invention provides a method for detecting Lafora's disease comprising determining if the Laforin protein is present in a sample from an animal.

The Laforin protein of the present invention may be detected in a biological sample using antibodies that are specific for Laforin using various immunoassays that are discussed below.

Conventional methods can be used to prepare the antibodies. For example, by using a peptide from the Laforin protein of the invention, polyclonal antisera or monoclonal antibodies can be made using standard methods. A mammal (e.g., a mouse, hamster, or rabbit) can be immunized with an immunogenic form of the peptide which elicits an antibody response in the mammal. Techniques for conferring immunogenicity on a peptide include conjugation to carriers or other techniques well known in the art. For example, the peptide can be administered in the presence of adjuvant. The progress of immunization can be monitored by detection of antibody titers in plasma or serum. Standard ELISA or other immunoassay procedures can be used with the immunogen as antigen to assess the levels of antibodies. Following immunization, antisera can be obtained and, if desired, polyclonal antibodies isolated from the sera.

To produce monoclonal antibodies, antibody producing cells (lymphocytes) can be harvested from an immunized animal and fused with myeloma cells by standard somatic cell fusion procedures, thus immortalizing these cells and yielding hybridoma cells. Such techniques are well known in the art (e.g., the hybridoma technique originally developed by Kohler and Milstein (Nature 256, 495-497 (1975)) as well as other techniques such as the human B-cell hybridoma technique (Kozbor et al., Immunol. Today 4, 72 (1983)), the EBV-hybridoma technique to produce human monoclonal antibodies (Cole et al. Monoclonal Antibodies in Cancer Therapy (1985) Alan R. Liss, Inc., pages 77-96), and screening of combinatorial antibody libraries (Huse et al., Science 246, 1275 (1989))). Hybridoma cells can be screened immunochemically for production of antibodies specifically reactive with the peptide and the monoclonal antibodies can be isolated. Therefore, the invention also contemplates hybridoma cells secreting monoclonal antibodies with specificity for a protein of the invention.

The term “antibody” as used herein is intended to include fragments thereof which also specifically react with a protein of the invention, or peptide thereof. Antibodies can be fragmented using conventional techniques and the fragments screened for utility in the same manner as described above. For example, F(ab′)₂ fragments can be generated by treating antibody with pepsin. The resulting F(ab′)₂ fragment can be treated to reduce disulfide bridges to produce Fab′ fragments.

Chimeric antibody derivatives, i.e., antibody molecules that combine a non-human animal variable region and a human constant region, are also contemplated within the scope of the invention. Chimeric antibody molecules can include, for example, the antigen binding domain from an antibody of a mouse, rat, or other species, with human constant regions. Conventional methods may be used to make chimeric antibodies containing the immunoglobulin variable region which recognizes a CipA protein (See, for example, Morrison et al., Proc. Natl Acad. Sci. U.S.A. 81, 6851 (1985); Takeda et al., Nature 314, 452 (1985); Cabilly et al., U.S. Pat. No. 4,816,567; Boss et al., U.S. Pat. No. 4,816,397; Tanaguchi et al., European Patent Publication EP171496; European Patent Publication 0173494; United Kingdom patent GB 2177096B).

Monoclonal or chimeric antibodies specifically reactive with a protein of the invention as described herein can be further humanized by producing human constant region chimeras, in which parts of the variable regions, particularly the conserved framework regions of the antigen-binding domain, are of human origin and only the hypervariable regions are of non-human origin. Such immunoglobulin molecules may be made by techniques known in the art (e.g., Teng et al., Proc. Natl. Acad. Sci. U.S.A., 80, 7308-7312 (1983); Kozbor et al., Immunology Today, 4, 72-79 (1983); Olsson et al., Meth. Enzymol., 92, 3-16 (1982); and PCT Publication WO92/06193 or EP 0239400). Humanized antibodies can also be commercially produced (Scotgen Limited, 2 Holly Road, Twickenham, Middlesex, Great Britain).

Specific antibodies, or antibody fragments, reactive against a protein of the invention may also be generated by screening expression libraries encoding immunoglobulin genes, or portions thereof, expressed in bacteria with peptides produced from the nucleic acid molecules of the present invention. For example, complete Fab fragments, VH regions and FV regions can be expressed in bacteria using phage expression libraries (See for example Ward et al., Nature 341, 544-546 (1989); Huse et al., Science 246, 1275-1281 (1989); and McCafferty et al. Nature 348, 552-554 (1990)).

Antibodies may also be prepared using DNA immunization. For example, an expression vector containing a nucleic acid of the invention (as described above) may be injected into a suitable animal such as a mouse. The protein of the invention will therefore be expressed in vivo and antibodies will be induced. The antibodies can be isolated and prepared as described above for protein immunization.

The antibodies may be labelled with a detectable marker including various enzymes, fluorescent materials, luminescent materials and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, biotin, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; and examples of suitable radioactive material include S-35, Cu-64, Ga-67, Zr-89, Ru-97, Tc-99m, Rh-105, Pd-109, In-111, I-123, I-125, I-131, Re-186, Au-198, Au-199, Pb-203, At-211, Pb-212 and Bi-212. The antibodies may also be labelled or conjugated to one partner of a ligand binding pair. Representative examples include avidin-biotin and riboflavin-riboflavin binding protein. Methods for conjugating or labelling the antibodies discussed above with the representative labels set forth above may be readily accomplished using conventional techniques.

The antibodies reactive against proteins of the invention (e.g. enzyme conjugates or labelled derivatives) may be used to detect a protein of the invention in various samples; for example, they may be used in any known immunoassays which rely on the binding interaction between an antigenic determinant of a protein of the invention and the antibodies. Examples of such assays are radioimmunoassays, enzyme immunoassays (e.g. ELISA), immunofluorescence, immuno-precipitation, latex agglutination, hemagglutination, and histochemical tests. Thus, the antibodies may be used to identify or quantify the amount of a protein of the invention in a sample in order to diagnose the presence of Lafora's disease.

In a method of the invention, a predetermined amount of a sample or concentrated sample is mixed with antibody or labelled antibody. The amount of antibody used in the process is dependent upon the labelling agent chosen. The resulting protein bound to antibody or labelled antibody may be isolated by conventional isolation techniques, for example, salting out, chromatography, electrophoresis, gel filtration, fractionation, absorption, polyacrylamide gel electrophoresis, agglutination, or combinations thereof.

The sample or antibody may be insolubilized; for example, the sample or antibody can be reacted using known methods with a suitable carrier. Examples of suitable carriers are Sepharose or agarose beads. When an insolubilized sample or antibody is used, protein bound to antibody or unreacted antibody is isolated by washing. For example, when the sample is blotted onto a nitrocellulose membrane, the antibody bound to a protein of the invention is separated from the unreacted antibody by washing with a buffer, for example, phosphate buffered saline (PBS) with bovine serum albumin (BSA).

When labelled antibody is used, the presence of Laforin can be determined by measuring the amount of labelled antibody bound to a protein of the invention in the sample or of the unreacted labelled antibody. The appropriate method of measuring the labelled material is dependent upon the labelling agent.

When unlabelled antibody is used in the method of the invention, the presence of Laforin can be determined by measuring the amount of antibody bound to the protein using substances that interact specifically with the antibody to cause agglutination or precipitation. In particular, labelled antibody against an antibody specific for a protein of the invention can be added to the reaction mixture. The presence of a protein of the invention can be determined by a suitable method from among the already described techniques depending on the type of labelling agent. The antibody against an antibody specific for a protein of the invention can be prepared and labelled by conventional procedures known in the art which have been described herein. The antibody against an antibody specific for a protein of the invention may be a species specific anti-immunoglobulin antibody or monoclonal antibody; for example, goat anti-rabbit antibody may be used to detect rabbit antibody specific for a protein of the invention.

(iii) Kits

The reagents suitable for carrying out the methods of the invention may be packaged into convenient kits providing the necessary materials, packaged into suitable containers. Such kits may include all the reagents required to detect a nucleic acid molecule or protein of the invention in a sample by means of the methods described herein, and optionally suitable supports useful in performing the methods of the invention.

In one embodiment of the invention, the kit includes primers which are capable of amplifying a nucleic acid molecule of the invention or a predetermined oligonucleotide fragment thereof, all the reagents required to produce the amplified nucleic acid molecule or predetermined fragment thereof in the polymerase chain reaction, and means for assaying the amplified sequences. The kit may also include restriction enzymes to digest the PCR products. In another embodiment of the invention, the kit contains a nucleotide probe which hybridizes with a nucleic acid molecule of the invention, reagents required for hybridization of the nucleotide probe with the nucleic acid molecule, and directions for its use. In a further embodiment of the invention, the kit includes antibodies of the invention and reagents required for binding of the antibody to a protein of the invention in a sample.

The methods and kits of the present invention may be used to detect Lafora's disease. Samples which may be tested include bodily materials such as blood, urine, serum, tears, saliva, feces, tissues, cells and the like. In addition to human samples, samples may be taken from mammals such as non-human primates, etc.

Before testing a sample in accordance with the methods described herein, the sample may be concentrated using techniques known in the art, such as centrifugation and filtration. For the hybridization and/or PCR-based methods described herein, nucleic acids may be extracted from cell extracts of the test sample using techniques known in the art.

B. Therapeutic Applications

As mentioned previously, the nucleic acid molecules of the present invention are deleted or mutated in people with Lafora's disease. Accordingly, the present invention provides a method of treating or preventing Lafora's disease by administering a nucleic acid sequence containing a sufficient portion of the EPM2A gene to treat or prevent Lafora's disease.

Recombinant molecules comprising a nucleic acid sequence or fragment thereof may be directly introduced into cells or tissues in vivo using delivery vehicles such as retroviral vectors, adenoviral vectors and DNA virus vectors. They may also be introduced into cells in vivo using physical techniques such as microinjection and electroporation or chemical methods such as coprecipitation and incorporation of DNA into liposomes. Recombinant molecules may also be delivered in the form of an aerosol or by lavage.

The nucleic acid sequences may be formulated into pharmaceutical compositions for administration to subjects in a biologically compatible form suitable for administration in vivo. By “biologically compatible form suitable for administration in vivo” is meant a form of the substance to be administered in which any toxic effects are outweighed by the therapeutic effects. The substances may be administered to living organisms including humans and animals. Administration of a therapeutically active amount of the pharmaceutical compositions of the present invention is defined as an amount effective, at dosages and for periods of time necessary, to achieve the desired result. For example, a therapeutically active amount of a substance may vary according to factors such as the disease state, age, sex, and weight of the individual, and the ability of antibody to elicit a desired response in the individual. Dosage regimens may be adjusted to provide the optimum therapeutic response. For example, several divided doses may be administered daily or the dose may be proportionally reduced as indicated by the exigencies of the therapeutic situation.

The active substance may be administered in a convenient manner such as by injection (subcutaneous, intravenous, etc.), oral administration, inhalation, transdermal application, or rectal administration. Depending on the route of administration, the active substance may be coated in a material to protect the compound from the action of enzymes, acids and other natural conditions which may inactivate the compound.

The compositions described herein can be prepared by per se known methods for the preparation of pharmaceutically acceptable compositions which can be administered to subjects, such that an effective quantity of the active substance is combined in a mixture with a pharmaceutically acceptable vehicle. Suitable vehicles are described, for example, in Remington's Pharmaceutical Sciences (Remington's Pharmaceutical Sciences, Mack Publishing Company, Easton, Pa., USA 1985). On this basis, the compositions include, albeit not exclusively, solutions of the substances in association with one or more pharmaceutically acceptable vehicles or diluents, and contained in buffered solutions with a suitable pH and iso-osmotic with the physiological fluids.

C. Experimental Models

The present invention also includes methods and experimental models for studying the function of the EPM2A gene and Laforin protein. Cells, tissues and non-human animals that lack the EPM2A gene or are partially lacking in Laforin expression may be developed using recombinant expression vectors having a specific deletion or mutation in the EPM2A gene. A recombinant expression vector may be used to inactivate or alter the EPM2A gene by homologous recombination and thereby create an EPM2A deficient cell, tissue or animal.

Null alleles may be generated in cells, such as embryonic stem cells, by deletion mutation. A recombinant EPM2A gene may also be engineered to contain an insertion mutation which inactivates EPM2A. Such a construct may then be introduced into a cell, such as an embryonic stem cell, by a technique such as transfection, electroporation, injection etc. Cells lacking an intact EPM2A gene may then be identified, for example by Southern blotting, Northern blotting or by assaying for EPM2A using the methods described herein. Such cells may then be fused to embryonic stem cells to generate transgenic non-human animals deficient in EPM2A. Germline transmission of the mutation may be achieved, for example, by aggregating the embryonic stem cells with early stage embryos, such as 8 cell embryos, in vitro; transferring the resulting blastocysts into recipient females; and generating germline transmission of the resulting aggregation chimeras. Such a mutant animal may be used to define specific cell populations, developmental patterns and in vivo processes normally dependent on EPM2A expression. The present invention also includes the preparation of tissue specific knock-outs of the EPM2A gene.

The following non-limiting examples are illustrative of the present invention:

EXAMPLES

Example 1

Characterization of EPM2A

Materials and Methods

Patients. The diagnosis of Lafora's disease in patients with teenage onset progressive myoclonus epilepsy was confirmed by demonstration of Lafora bodies in skin, liver, muscle or brain biopsies (6-9) in at least one affected member from each of 38 families included in this study.

Physical mapping. Using mapping data available from the Whitehead Institute/MIT Genome Center (http://mit-genome.wi.mit.edu/) as well as by identifying additional clones, it was possible to establish an overlapping set of yeast artificial chromosome (YAC) clones between D6S1003 and D6S311. A total of 136 markers (12 genes, 41 ESTs, and 83 STSs/probes) were assayed against the YAC contig and 32 of these were found to be in the EPM2A critical region (FIG. 1). We also isolated 129 P1-derived artificial chromosomes (PACs) which cover an estimated 90% of the region between D6S1003 and D6S311 and have aligned the PACs by probe content, restriction mapping, as well as fingerprint analysis. Information on all DNA markers can be found at the Genome DataBase (http://www.gdbwww.gdb.org/) or the Sanger Genome Center WWW site (http://www.sanger.ac.uk/HGP/Chr6/).

FIG. 1 illustrates the physical map of the Lafora's disease critical region. (A) A yeast artificial chromosome (YAC) contig was established covering the 1.5 Mb critical region between D6S1003 and D6S311. The presence of a DNA marker on a YAC clone is shown by a corresponding vertical bar. The markers that are highlighted with a circle and a square represent genetic markers or ESTs, respectively, while the remaining ones are unique landmarks (STSs). The region between D6S1003 and D6S1042 that demonstrated an extended region of homozygosity in affected members of a previously uncharacterized family is shown by a thicker horizontal bar and this is the new EPM2A critical region (see FIG. 2A); (B) A P1-derived artificial chromosome (PAC) map encompassing the immediate region surrounding D6S1703. The extent of the deletion could be defined by PCR analysis of mapped STSs (see FIG. 2B). LDCR4 represents a transcript of unidentified function and EPM2A is the Lafora disease gene. Since the 5′-end of this gene is not yet known, it is represented with a dashed line.

Northern blots, cDNA library screening, and RACE. Multiple-tissue (cat. #7760-1) and Human Brain II (cat. #7755-1) Northern blots were purchased from Clontech and hybridization was carried out as recommended by the supplier. The transcript A specific probe was generated using PCR primers 266F (5′-CGGCACGAGGATTATTCAAG-3′) (SEQ ID NO: 11) and GSP3 (5′-GCTCGGGTACTGAGGTCTG-3′) (SEQ ID NO: 12), which amplified a 190 bp fragment from cDNA clone 266552 (FIG. 3). The transcript B specific probe was derived using PCR primers AA490925F (5′-AGTTGTTACACAGGGTTGTTGG-3′) (SEQ ID NO: 13) and AA490925R (5′-AGGCTGTACATCAGACAGAAGG-3′) (SEQ ID NO: 14), which amplified a 373 bp segment from cDNA SFB14 (FIG. 3). We have sequenced the HTF-island shown in FIG. 1B at the 5′-end of EPM2A.

Genotyping. Haplotypes for 6q23-25 were constructed for all family members using microsatellite markers at loci D6S314, D6S1704, D6S1003, D6S1010, D6S1049, D6S1703, D6S1042, D6S1649, D6S978, D6S311 and D6S1637. Primer sequences were obtained from Genethon or from the Cooperative Human Linkage Centre. PCR conditions have been reported previously (13). PCR products were separated on polyacrylamide gels. In 8 families (20%), haplotype analyses revealed evidence against linkage to 6q23-25. Of the remaining 30 LD families, 16 reported a history of consanguinity. Thirty-one of these families have been described previously (refs. 12, 13, 25, 25).

Mutation Analysis. Mutations were detected by radioactive cycle sequencing using the Thermosequenase Kit (Amersham Life Science) with Qiagen column purified PCR products. The combinations of PCR primer pairs used were JRGXBCF (5′-TCCATTGTGCTAATGCTATCTC-3′) (SEQ ID NO: 15) and JRGXBCR (5′-TCAGCTTGCTTTGAGGATATTT-3′) (SEQ ID NO: 16), product size 310 bp; 824F (5′-GCCGAGTACAGATGCTGCC-3′) (SEQ ID NO: 17) and 824R (5′-CACACAGTCCTTTCAGTTCAGG-3′) (SEQ ID NO: 18), product size 384 bp; and H1F (5′-GAATCTCTTTCCACTTTGC-3′) (SEQ ID NO: 7) and 824R, product size 587 bp. The positions of the primers are shown in FIG. 3.

Characterization of Lafora's Disease Gene

To characterize the extent of the homozygous deletion in the affected members of LD-L4, a P1-derived artificial chromosome (PAC) contig extending outwards from D6S1703 was constructed. It could be determined that the deletion encompassed approximately 50 kb and that it did not directly interrupt the LDCR4 transcription unit (FIG. 1B). PAC clones 365C1, 466P17 and 28H5 (which encompassed the deletion) were sequenced in order to identify new candidate transcription units (FIG. 1B). A segment of DNA (E42) located within the deletion detected a single EST (clone 743381) in the database (FIG. 3). DNA sequencing of this cDNA indicated it contained a segment of identity with one other EST (266552). This EST, however, was aligned previously with others into separate groups (or Unigenes named Hs.22464 and Hs.112229). Subsequently, we used clones 743381 and 824559 and PCR primers derived from their sequence for screening of multiple cDNA libraries in an attempt to clone the entire coding region of this gene.

FIG. 2 shows a refined mapping of the Lafora disease gene. (A) Pedigrees and genotype data are provided for Lafora family LD39. Individuals affected (solid) or unaffected (open) with Lafora disease are indicated. Below each individual is the corresponding genotype data (the markers are listed in their order from centromere (top) to telomere (bottom) as determined using the physical map shown in FIG. 1). The boxed segments of the haplotypes indicate regions of homozygosity. The loci in bold indicate the previous LD critical region. (B) Detection of 2 markers (D6S1703 and 109F4.E05.5) determined to be absent by PCR in the affected members of the consanguineous Lafora family LD-L4.

FIG. 3 shows overlapping cDNA clones aligned with genomic DNA segments. The portion of each cDNA clone for which there was sequence is represented with a box. The corresponding genomic fragments are shown as stippled boxes below. The clones preceded with an (E) and (H) represent EcoRI and HindIII fragments, respectively. The positions of the primers used for mutation screening are shown, as are the site of the phosphatase domain and the stop codon (*).

Through analysis of the alignment of the DNA sequences of all of the EST clones as well as the newly identified cDNAs, at least 4 putative types of transcripts that corresponded to EPM2A could be defined (named transcripts A, B, C, and D) (FIG. 3). The cDNAs grouped into transcript A could be categorized based on regions of sequence identity at their 3′-ends. A consensus sequence was compiled and it was found to be distributed amongst 4 exons spanning approximately 130 kb (FIGS. 1A and 3). A single cDNA (266552) representing transcript B shared exact identity with transcript A except for the omission of a 1,700 bp segment due to splicing (FIGS. 3 and 4). By comparing the corresponding genomic regions to the cDNAs, a common origin for transcripts A and B could be verified, suggesting they are alternative forms of the same gene, the gene products of which would be predicted to have unique carboxyl-terminal amino acid sequences (FIG. 4B).

FIG. 4 shows the nucleotide sequence of cDNA encoding the EPM2A gene together with the predicted amino acid sequence. (A) The consensus nucleotide sequence was derived from the cDNA clones 266552, RACE-A, RACE-B, RACE-C, and RACE-D shown in FIG. 3. The positions of the mutations identified are indicated. The (*) indicates a stop mutation site and the position of 2 known splice junctions is shown by the horizontal arrows. An A to T polymorphism which is present in approximately 40-50% of the population is shown. (B) The deduced C terminus of transcript A compared with transcript B. The latter arises due to the removal by splicing of nt 738-2508 (FIG. 3 and FIG. 4A), which would be predicted to generate an isoform with a unique 3′ end. At the present time, transcript B is known to extend to position 94 of the predicted amino acid sequence shown (FIG. 4A). Transcript C (cDNA SFB14) is described elsewhere. (C) The putative PTP active sites of EPM2A, MTM1, PTEN, PTP18, dPTP61F and viral PTP. The shaded amino acids (C and R) represent catalytic residues. On the basis of sequence analysis alone, laforin is predicted to be an intracellular PTP with dual specificity phosphatase activity.

The inventors determined a partial map (FIG. 3) and sequenced the corresponding genomic regions that contained nucleotide identity to these segments to prove their common origin. The results suggest that transcripts A, B, C and D are indeed alternatively spliced forms of the same gene. The consensus sequence presently compiled for transcript A was distributed amongst at least 4 exons spanning greater than 50 kb, while transcript B was represented as a contiguous segment of DNA. A single EST clone, 743381, which represents another alternatively spliced form that appeared to be most common to transcript A, was also identified (FIG. 3A). It contained at least 8 exons (FIG. 3) but a significant open reading frame was not detected. The newly identified gene, EPM2A, which encodes Laforin, was the only one determined to be deleted in family LD-L4 (FIG. 1).

Two other single cDNA clones, SFB14 and 743381, which could represent additional alternative forms of EPM2A, were also identified (FIG. 3). SFB14 was contiguous to genomic DNA and identical to the 3′-end of transcript A except that its open reading frame (ORF) was predicted to extend 48 amino acids 5′ into the last intron shown in FIG. 3. Clone 743381 contained 8 exons with appropriate exon-intron boundaries (FIG. 3), but its significance could not be assessed due to the lack of a continuous open reading frame.

In addition to the essential cysteine and arginine residues found in all PTPs (FIG. 4C), EPM2A contains an aspartic acid positioned 31 residues amino-terminal of the cysteine nucleophile. This amino acid is important for catalysis as it is located on a loop that undergoes conformational change when substrate is bound to the enzyme.

The corresponding mRNA for EPM2A was determined to be 3200 nucleotides in length in multiple tissues based on RNA gel-blot hybridization experiments. FIG. 5 shows the RNA expression pattern of the Laforin gene by Northern blot analysis in the different tissues indicated at the top. The probes used are described in the Materials and Methods and the exposure time was 4 days at −80° C. The EPM2A message is observed in all tissues tested, and the apparent overexpression in heart and skeletal muscle is due to overloading of mRNA in these lanes, as was seen when using any gene-specific probe. The results of FIG. 5 illustrate that strong hybridization signals were detected in skeletal muscle RNA and clear signals were also seen in heart, brain, placenta, lung, liver, kidney and pancreas. In addition, the same size mRNA was detected in cerebellum, cerebral cortex, medulla, spinal cord, occipital pole, frontal lobe, temporal lobe, and putamen. Identical results showing the same 3200 nucleotide message and tissue distribution were observed when a DNA probe believed to be specific for each isoform of the gene, based on the established consensus sequences, was used. For example, a probe derived from the 3′-UTR region of transcript B of EPM2A was determined unequivocally to be specific for this isoform. For transcript A, the probe was generated from the unique region shown in FIG. 4A, and RT-PCR experiments seemed to confirm the specificity of this fragment (data not shown). On the basis of the northern-blot results and the relative number of ESTs identified, it is probable that transcript A represents the major isoform of EPM2A, and that it corresponds to the 3.2 kb mRNA. From the analysis of the genomic DNA sequence, we have identified an additional ORF at the HTF-island (FIG. 3). As this predicted exon has all the proposed features of the consensus sequence of a eukaryotic translation initiation site, and 113 nt of it are represented in the consensus cDNA sequence, it could represent the 5′ end of EPM2A.

The protein encoded by EPM2A contains an amino acid motif (FIG. 1C) that corresponds with the consensus sequence (SEQ ID NO: 22), HCxxGxxRS(T), of the catalytic site of PTPs. In addition to the essential cysteine and arginine residues found in all PTPs (FIG. 4C), EPM2A contains the expected aspartic acid necessary for completion of the catalytic reaction, positioned 31 aa N-terminal of the cysteine nucleophile.
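The motif and the upstream aspartic acid described above can be located programmatically. The following is a minimal sketch, assuming that residues 225-288 of the predicted amino acid sequence in the sequence listing (SEQ ID NO: 2) are transcribed into one-letter code as shown; the regular expression and the function name find_ptp_site are illustrative choices, not part of the original analysis.

```python
import re

# Residues 225-288 of SEQ ID NO: 2, transcribed to one-letter code
# for this illustration (16 residues per line, as in the listing).
FRAGMENT_START = 225
FRAGMENT = (
    "GLAYIWMPTPDMSTEG"
    "RVQMLPQAVCLLHALL"
    "EKGHIVYVHCNAGVGR"
    "STAAVCGWLQYVMGWN"
)

# HCxxGxxRS(T): x is any residue, the last position is Ser or Thr.
PTP_MOTIF = re.compile(r"HC..G..R[ST]")


def find_ptp_site(fragment: str = FRAGMENT, start: int = FRAGMENT_START):
    """Return the motif, the residue number of its cysteine, and the
    residue found 31 positions N-terminal of that cysteine."""
    match = PTP_MOTIF.search(fragment)
    if match is None:
        return None
    cys_index = match.start() + 1          # C follows the leading H
    upstream_index = cys_index - 31
    upstream_aa = fragment[upstream_index] if upstream_index >= 0 else None
    return {
        "motif": match.group(),
        "cys_residue": start + cys_index,
        "upstream_residue": start + upstream_index,
        "upstream_aa": upstream_aa,
    }


if __name__ == "__main__":
    print(find_ptp_site())
```

Under these assumptions the sketch reports the motif HCNAGVGRS with its cysteine at residue 266 and an aspartic acid (D) at residue 235, consistent with the 31-residue spacing stated in the text.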

In an attempt to isolate the remainder of the coding region for these transcripts, we performed multiple rounds of 5′-RACE on total brain and poly(A)+ mRNA, which has allowed us to extend transcript A (but not transcript B) further. Beyond the most 5′ sequences shown in FIG. 4, however, all of the RACE clones recovered seemed to share the expected DNA sequences but then diverged in different ways that did not allow for a common consensus to be established. However, comparative DNA sequence analysis of the human EPM2A gene and its corresponding mouse homolog (also called EPMA) confirmed the full length gene sequence as shown in FIG. 13.

The deduced amino acid sequence of the newly identified protein(s) indicated that transcripts A, B, C and D encode a 9 amino acid motif (FIG. 4A) that corresponds exactly to the consensus sequence (SEQ ID NO: 22), HCxxGxxRS(T), of the active catalytic site of protein tyrosine phosphatases (PTPs) (14,15). So far, no other structural motifs could be identified, and from the sequence it is not apparent whether this protein belongs to the receptor-like PTPs, the intracellular PTPs, or the dual specificity phosphatases (DSPs), which dephosphorylate both tyrosine and serine/threonine residues (16). The identification of the EPM2A gene as a putative PTP provides the first clue toward understanding the basic defect.

At the HTF-island shown in FIG. 3, we have identified through GRAIL analysis (http://compbio.ornl.gov) an additional putative exon 189 nucleotides in length. An ATG (AUG) triplet is present at the beginning of this predicted ORF, and the nucleotide sequence surrounding it (CCCGCCAUGC) (SEQ ID NO: 23) has the proposed features of the consensus sequence (GCCA/GCCAUGG) (SEQ ID NO: 24) of a eukaryotic translation initiation site (12). The predicted start exon maintains an open reading frame with the most 5′ sequence of transcript A, and this combined stretch of 298 nucleotides contains exon/intron junction sequences with splice sites that conform with the consensus in other mammalian genes. If the predicted exon is part of EPM2A, transcript A would be predicted to be 317 amino acids long.
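The comparison between the observed start-site context and the initiation consensus can be made explicit position by position. The sketch below is illustrative only: it assumes the two 10-nucleotide sequences quoted in the paragraph above are aligned on their AUG triplets, writes the A/G alternative as the IUPAC code R, and uses a hypothetical helper name (mismatch_positions).

```python
# Bases allowed by each consensus symbol; R is the IUPAC code for A or G.
IUPAC = {"A": "A", "C": "C", "G": "G", "U": "U", "R": "AG"}

OBSERVED = "CCCGCCAUGC"    # sequence around the predicted start codon
CONSENSUS = "GCCRCCAUGG"   # GCC(A/G)CCAUGG written with R for A/G


def mismatch_positions(observed: str, consensus: str) -> list:
    """Return 0-based positions where the observed base is not allowed
    by the consensus symbol at the same position."""
    if len(observed) != len(consensus):
        raise ValueError("sequences must have equal length")
    return [
        i
        for i, (obs, con) in enumerate(zip(observed, consensus))
        if obs not in IUPAC[con]
    ]


if __name__ == "__main__":
    bad = mismatch_positions(OBSERVED, CONSENSUS)
    print(f"{len(OBSERVED) - len(bad)} of {len(OBSERVED)} positions match; "
          f"mismatches at {bad}")
```

With these inputs, 8 of the 10 positions agree; the differences are the first base and the base immediately following the AUG, while the purine three bases upstream of the AUG is retained.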

Example 2

EPM2A Mutations

Using the available genomic structure for the gene, the inventors screened an affected member from each of 30 Lafora families for mutations by direct DNA sequencing. A total of 14 mutations were detected, consisting of 12 different DNA sequence alterations and 2 microdeletions. The mutations are summarized in Table 3. The mutation from C to A at position −12 refers to a mutation that occurs 12 bases upstream from the ATG start codon in FIG. 13. Some of the sequence upstream of the ATG is as follows: (SEQ ID NO: 19) . . . gccgggtattcgcgccgCcgccgcccgccATG . . . The mutation site at −12 is indicated with a capital C. To date, mutations have been found in 65% of EPM2A families. Some of the mutations are discussed below.
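The −12 numbering can be checked directly against the upstream sequence quoted above. This is a small sketch under one stated assumption: positions are counted so that the base immediately 5′ of the A of the ATG is −1 (there is no position 0). The function name marked_position is hypothetical.

```python
# Upstream sequence as quoted in the text (SEQ ID NO: 19); the mutated
# base is the capital C and the start codon is the capital ATG.
UPSTREAM = "gccgggtattcgcgccgCcgccgcccgccATG"


def marked_position(seq: str) -> int:
    """Position of the first upper-case base before the ATG, numbered so
    that the base just 5' of the ATG is -1."""
    atg_index = seq.index("ATG")
    marked_index = next(
        i for i, base in enumerate(seq[:atg_index]) if base.isupper()
    )
    return marked_index - atg_index


if __name__ == "__main__":
    print(marked_position(UPSTREAM))
```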

Two mutations that, based on the current consensus sequences, were specific for transcript A could be detected. Family LD-5 contained a homozygous C to T point mutation which resulted in an arginine to cysteine change affecting a region of unknown function. To test for the presence of the C to T point mutation of family LD-5 in the unaffected population, PCR was completed on 54 samples (108 chromosomes) using JRGXBF and JRGXBR primers and the product was blotted in duplicate. One membrane was hybridized with a wild type oligonucleotide (ATCATGACCGTTGCTGTAC) (SEQ ID NO: 20) and the other with the LD5 mutant oligonucleotide (TCATCATGACTGTTGCTGTAC) (SEQ ID NO: 21) at 42° C. (washing with 5×SSC at room temperature for 20 minutes followed by 2×SSC for 20 minutes at 65° C.). No mutant alleles were found.
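The single-base difference that the two allele-specific oligonucleotides are designed to discriminate can be found by comparing them. The sketch below is illustrative: it assumes the two probes share a common 3′ end (the mutant probe is two bases longer at its 5′ end), so the sequences are right-aligned before comparison. The function name is hypothetical.

```python
WILD_TYPE = "ATCATGACCGTTGCTGTAC"    # SEQ ID NO: 20
MUTANT = "TCATCATGACTGTTGCTGTAC"     # SEQ ID NO: 21


def right_aligned_differences(a: str, b: str) -> list:
    """Compare two oligonucleotides aligned at their 3' ends and return
    (index, base_in_a, base_in_b) for each mismatch. The index is 0-based
    within the overlapping region."""
    overlap = min(len(a), len(b))
    a_tail, b_tail = a[-overlap:], b[-overlap:]
    return [
        (i, x, y)
        for i, (x, y) in enumerate(zip(a_tail, b_tail))
        if x != y
    ]


if __name__ == "__main__":
    print(right_aligned_differences(WILD_TYPE, MUTANT))
```

Under this alignment the probes differ at a single position, a C in the wild type and a T in the mutant, which matches the C to T change described for family LD-5.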

The inventors have screened 100 normal chromosomes for this change and no mutant alleles were found. In family I-22, a homozygous G to T nonsense change in a region specific to transcript A would predict premature termination of the EPM2A protein. In sequences common to both isoforms, the inventors detected in the consanguineous family LD100-4 a homozygous insertion of an A, which would result in a frameshift that would interrupt the tyrosine phosphatase domain. The inventors have identified in 4 consanguineous families a homozygous nonsense mutation resulting from a C to T change, which introduces a premature stop codon just preceding the tyrosine phosphatase domain. This same nonsense mutation was found on one chromosome of one additional family (L6), while the other chromosome had a G to A change which results in a non-conservative glycine to serine substitution. Finally, in family LD33 an A to T transversion results in a glutamine to leucine change in a residue located just after the tyrosine phosphatase domain near the carboxy terminus. This mutation, apparently the mildest found, occurs in a family with relative preservation of mental functions and a relatively protracted course (13). The five families having the C to T change are all of Spanish descent, indicating this may be the common mutation in this ethnic background.

FIG. 6 shows representative mutations found in 2 Lafora's disease families. The left, middle, and right panels show the in-frame sequence of 5 codons surrounding the mutation site in an unaffected non-carrier sibling, an EPM2A mutation carrier parent, and an affected individual, respectively. (A) Family LD-16, in which a homozygous C to T transition results in the introduction of a stop mutation, and (B) Family LD-33, in which a homozygous missense mutation results in a glutamine to leucine change.
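Single-base substitutions such as those discussed in this Example fall into two classes: transitions (purine to purine, or pyrimidine to pyrimidine) and transversions (purine to pyrimidine, or the reverse). A minimal sketch of this classification is given below; the function name classify_substitution is a hypothetical helper introduced only for illustration.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Classify a single-base DNA substitution as a transition or a
    transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError("expected two different bases from A, C, G, T")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


if __name__ == "__main__":
    for ref, alt in [("C", "T"), ("G", "A"), ("A", "T"), ("G", "T")]:
        print(f"{ref} to {alt}: {classify_substitution(ref, alt)}")
```

By this rule the C to T and G to A changes are transitions, whereas the A to T and G to T changes are transversions.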

The unraveling of the aetiopathogenesis of Lafora's disease needs to include an understanding of the formation of the pathognomonic Lafora bodies. These unique structures have been found in LD patients in the same tissues in which we have observed EPM2A expression (6-8, and FIG. 5). Polyglucosans are unbranched equivalents of glycogen (10). Polyglucosan bodies resembling and sharing common antigenicity with Lafora bodies have been found in glycogen storage disease type IV (Andersen disease) and in the normal corpora amylacea of aged brains (17). Andersen disease has been shown to arise due to mutations in the α-1,4 glucan gene on chromosome 3 which codes for the glycogen branching enzyme (18). It is possible that mutations in a gene that lead to the lack of production of the Laforin tyrosine phosphatase protein could affect the metabolism of glycogen. Both glycogen biosynthesis and breakdown are heavily regulated by phosphokinases and phosphatases (14).

EPM2A has at least two alternate forms (as does MTM1) which appear to encode protein isoforms that might be predicted to have different functions or subcellular localizations, in a manner analogous to the Drosophila PTP, dPTP61F, which also undergoes alternative splicing at the 3′ end (24, 21). In the case of dPTP61F, it is known that the alternate carboxy termini govern the localization of the protein to either the cytoplasmic membrane or the nucleus (24).

Although it seems that the accumulations in Lafora bodies are responsible for neuronal death in Lafora's disease, it is not clear whether the epilepsy is secondary to neurodegeneration or is a direct result of abnormal neuronal Laforin expression. In various models, both synaptic transmission and key components of neuronal excitability such as the NMDA type of voltage-gated calcium channels appear to be subject to phosphoregulation (19,20).

With 75 of 500 different potential DSPs and PTPs discovered so far, this evolving family of phosphatases is likely to have as diverse and as important a role in various regulatory processes as its counterpart family of protein tyrosine kinases. Biological functions attributed to these proteins so far include regulation of neuronal adhesion, control of axonal pathfinding, regulation of growth factor, cytokine and oligomeric receptor signaling, dephosphorylation of MAP kinases (MAPKs), and other roles in tumor suppression (16). Involvement of members of this phosphatase family in non-neoplastic diseases has been found in only one other human disorder, namely X-linked myotubular myopathy (21). In this disease, mutations of the DSP MTM1 result in an arrest of muscle maturation in utero after a period of normal development (22).

Laforin is the first member of the family of PTPs and DSPs to be involved in human central nervous system disease. Further investigation will be necessary to understand its role in normal brain, in the formation of Lafora bodies, and in Lafora's disease and its epilepsy.

Example 3

Summary of Common EPM2A Mutations

Patients and methods

Patients reported here had biopsy-proven Lafora's disease. Polymerase chain reaction (PCR) primer sequences and conditions were: JRGXBF: (SEQ ID NO: 9) 5′-TCCATTGTGCTAATGCTATCTC-3′, JRGXBR: (SEQ ID NO: 10) 5′-TCAGCTTGCTTTGAGGATATTT-3′, H1F: (SEQ ID NO: 7) 5′-GAATGCTCTTTCCACTTTGC-3′, PTPR: (SEQ ID NO: 8) 5′-GGCTCCTTAGGGAAATCAG-3′; annealing: 62° C.; [MgCl2]=1.25 mM. Stock DNA was used; PCR products were purified on Qiagen columns. Restriction digests were performed at 37° C., and products were run on 3% agarose gels.

Results

Mutations

EPM2A is composed of 4 exons located within a ˜130,000 bp span of chromosome 6q24. FIG. 11 shows a refined map of the deletion breakpoints in families LD-L4, LD9 and LD1. Filled symbols indicate patients with LD. Open rectangles on the map are the exons of EPM2A. Genomic structure around exons 1 and 2 is shown to scale. PCR markers 365C1.H65, 266B13, D6S1703A, JRGBF/R, LDXDF/R, 109F4.E.05 and dj28H5T7 were tested. Primer sequences can be obtained by looking up PAC 466P17 at http://www.sanger.ac.uk. The positions of the forward primers of these markers on the PAC are at 58336, 59869, 98214, 108805, 123524, 124039 and 132487 bp, respectively. The maximum extents of the deletions are shown on the right. The deletion breakpoint regions for LD-L4 and LD9 are coloured black on the map and are distinct from the deletion breakpoint regions for LD1, which are coloured grey. Each of the four deletion breakpoints contains a MIR repeat.

As a first step towards screening exon 2 for mutations, it was amplified by PCR with primers JRGXBF and JRGXBR. In the affected members from three families, LD-L4, LD9 and LD1, no PCR product was observed, indicating a possible homozygous deletion in these patients. In order to confirm and characterize the extent of this deletion, PCR was performed with primers covering the rest of the gene (FIG. 11). The extent of the deletion in families LD-L4 and LD9 was determined to be ˜75,000 bp, encompassing both exons 1 and 2. A smaller deletion of ˜25,000 bp was found in family LD1.

Screening Tests for the More Common Mutations

FIG. 12 shows restriction endonuclease screening for the two common mutations in exon 4. (A) Restriction map (to scale) of the PCR product obtained with primers H1F/PTPR. H, HaeIII restriction enzyme sites, one of which is destroyed by the C→T mutation; boxed P, PstI site created by the G→A mutation. (B) HaeIII and PstI digestion of the H1F/PTPR PCR product. Lane 1: 1 Kb ladder; lane 2: normal non-carrier individual with HaeIII digestion; lanes 3 and 4: appearance of an abnormal 199 bp band in a carrier with the C→T mutation (lane 3) and a patient with a homozygous mutation (lane 4); lane 5: PstI digestion does not affect the product from normal non-carriers; lane 6: PstI digests the PCR product into two smaller fragments in a carrier of the G→A mutation. In patients with a homozygous G→A mutation, PstI digestion should result in the disappearance of the 520 bp original band. However, we presently do not have such a patient in our data set.

The most common EPM2A mutation to date is a C→T nonsense mutation of the second base pair of exon 4, observed in 9 families (Table 2). Primers H1F and PTPR amplify a 520 bp DNA fragment encompassing exon 4 and including several recognition sites for the restriction enzyme HaeIII, one of which is destroyed by the C→T mutation. Digestion of this PCR product with HaeIII in normal non-carrier individuals results in nine small bands, the largest of which is 102 bp. Digestion with HaeIII in carriers or patients results in the appearance of an abnormal 199 bp band (FIGS. 12A and 12B). Carriers cannot be distinguished from patients who carry this mutation on both chromosomes using this test (FIG. 12B).

The second most common mutation is a G→A mutation of bp 115 in exon 4, observed in 4 families (Table 2). This mutation creates a unique PstI restriction site in the sequence of the H1F/PTPR PCR product. PstI does not digest this 520 bp PCR product in normal non-carrier individuals. Carriers will therefore have one normal 520 bp band and two variant bands of 195 bp and 315 bp (FIGS. 12A and 12B). Patients homozygous for this mutation will only have the abnormal bands.

Finally, several families with deletions of EPM2A have been described in Table 2. Two of these families (LD-L4 and LD9) appear to have identical ˜75 Kb deletions (FIG. 11), which are different from the other two (Table 2). Nonetheless, these three different deletion mutations all encompass exon 2 (FIG. 11, Table 2). Patients homozygous for any of these deletions can be picked up by the absence of PCR amplification using primers JRGXBF/JRGXBR and appropriate controls (FIG. 11).

Discussion

LD is most frequently diagnosed in societies with high rates of consanguinity. There also seems to be an excessive reporting from countries surrounding the Mediterranean basin, and many of those families appear not to be consanguineous. This initially suggested that, like other PMEs such as Unverricht-Lundborg disease (27) or the Neuronal Ceroid Lipofuscinoses (28), LD might be caused by a common mutation in most cases. This was shown not to be the case. The large number of different mutations renders their detection for clinical purposes difficult.

The simple DNA-based tests described above can be used to screen for the three more common mutations in the following fashion. Digestion of the H1F/PTPR PCR product with HaeIII and PstI detects the two more common mutations and will confirm that an individual is a carrier of one or the other mutation. The PstI test can further establish whether a patient or fetus is homozygous for the G→A mutation. In order to establish whether a patient is homozygous for the mutation detected by the HaeIII test, further analyses will be required, such as allele-specific oligonucleotide hybridization or DNA sequencing.
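The interpretation rules for the two restriction digests can be summarized as a short decision procedure. The following is a minimal sketch and not a validated diagnostic tool: it assumes that gel bands have already been called to the nominal sizes given in the text (520, 315, 195 and 199 bp), and the function names and result strings are hypothetical.

```python
def interpret_psti(bands) -> str:
    """Interpret a PstI digest of the H1F/PTPR product.

    Uncut 520 bp band only: G->A not detected.
    520 bp plus 195 and 315 bp bands: carrier.
    195 and 315 bp bands only: homozygous.
    """
    observed = set(bands)
    uncut = 520 in observed
    cut = {195, 315} <= observed
    if uncut and cut:
        return "carrier of G->A"
    if cut:
        return "homozygous for G->A"
    if uncut:
        return "G->A not detected"
    return "uninterpretable"


def interpret_haeiii(bands) -> str:
    """Interpret a HaeIII digest of the H1F/PTPR product.

    The abnormal 199 bp band indicates the C->T mutation, but this test
    does not separate carriers from homozygous patients.
    """
    if 199 in set(bands):
        return "C->T present (carrier or homozygous)"
    return "C->T not detected"


if __name__ == "__main__":
    print(interpret_psti([520, 315, 195]))
    print(interpret_haeiii([199, 102]))
```

A positive HaeIII result would still need follow-up by allele-specific oligonucleotide hybridization or sequencing to establish homozygosity, as stated above.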

PCR using JRGXBF/JRGXBR will detect the deletion mutations described in this report, but only in the homozygous state. This simple test can therefore serve for prenatal or symptomatic diagnosis, but cannot detect carriers. For carrier testing in these families further work will be required. For example, in two of the deletion families (LD-L4 and LD9), the polymorphic microsatellite marker D6S1703 is encompassed in the deletion and can be used to detect carriers by testing for loss of heterozygosity.

The C→T mutation appears to be common in patients of Spanish (or Iberian) origin (Tables 1 and 2). The ˜75 Kb deletion was observed in two of two Arabic families in our data set (LD-L4 and LD9). Parenthetically, LD9 is the same Arabic family described in reference 29 in which two affected siblings had discordant biopsy results. While false negative biopsies are usually due to insufficient sampling and/or biopsies done early in the course of the disease, genetic testing should not have these limitations.

Additional EPM2A mutations remain to be found, as presently we have identified mutations in only 65% of families. Furthermore, we have recently shown that an altogether different gene other than EPM2A causes LD in up to 20% of patients, including the families from the French Canadian province of Quebec (30). These patients are clinically and pathologically indistinguishable from those with EPM2A mutations (30).

Two deletions with different deletion breakpoints are described in this Example. Interestingly, analysis of the sequences of the breakpoint regions revealed the presence of the mammalian-wide interspersed repeat (MIR) (31) in all four breakpoint regions (FIG. 11). Duplicated or repetitive sequences flanking deleted genes or exons of a gene have been implicated in the generation of such deletions due to unequal recombination. A well-studied example of this from the neurological literature is Hereditary Neuropathy with Liability to Pressure Palsies. The putative mechanism in that deletion is complex, involving a large mariner repeat which codes for a transposase that might facilitate the recombination (32). The role, if any, of the short MIR repeats in the generation of the deletions in our LD patients is now under investigation.

In conclusion, the inventors have identified new EPM2A deletion mutations and described DNA-based screening tests for the detection of the more common EPM2A mutations. The identification of further mutations in EPM2A and in the as yet unidentified second gene, EPM2B, will improve the role of genetic testing and will provide insights into the function of the gene product laforin and the pathogenesis of LD.

While the present invention has been described with reference to what are presently considered to be the preferred examples, it is to be understood that the invention is not limited to the disclosed examples. To the contrary, the invention is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

All publications, patents and patent applications are herein incorporated by reference in their entirety to the same extent as if each individual publication, patent or patent application was specifically and individually indicated to be incorporated by reference in its entirety.

FULL CITATIONS FOR REFERENCES REFERRED TO IN THE SPECIFICATION

1. Delgado-Escueta, A. V., Wilson, W. A., Olsen, R. O., Porter, R. J. Jasper's Basic Mechanisms of the Epilepsies (Lippincott-Raven Publishers, 1998) Chapter 1.

2. Berkovic, S. F., Andermann, F., Carpenter, S., and Wolfe, L. S. 1986. Progressive myoclonus epilepsies: specific causes and diagnosis. New Eng. J. Med. 315, 296-305.

3. Minassian, B. A., Sainz, J. and Delgado-Escueta, A. V. 1996. Genetics of Myoclonic and Myoclonus epilepsies. Clin. Neuroscience 3, 223-235.

4. Van Heycop Ten Ham M W. 1974. Lafora disease, a form of progressive myoclonus epilepsy. Handbook of Clinical Neurology 15:382-422.

5. Minassian B A, Sainz J, Bohlega S, Sakamoto L M, Delgado-Escueta A V. 1996. Genetic heterogeneity in Lafora's disease. Epilepsia 37 suppl. 5, A126.

6. Lafora, G. R. 1911. Über das Vorkommen amyloider Körperchen im Innern der Ganglienzellen; zugleich ein Beitrag zum Studium der amyloiden Substanz im Nervensystem. Virchows Arch. Path. Anat. 205, 295-303.

7. Harriman, D. G. and Millar, J. H. D. 1955. Progressive familial myoclonic epilepsy in 3 families: its clinical features and pathological basis. Brain 78, 325-349.

8. Schwarz, G. A. and Yanoff, M. 1965. Lafora's disease, distinct clinico-pathologic form of Unverricht's syndrome. Arch Neurol. 12, 172-188.

9. Carpenter S and Karpati G. 1981. Sweat gland duct cells in Lafora disease: Diagnosis by skin biopsy. Neurol. 31:1564-1568.

10. Sakai, M., Austin, J., Witmer, F. and Trueb, L. 1970. Studies in myoclonus epilepsy (Lafora body form). Neurol. 20, 160-176.

11. Carpenter, S., Karpati, G., Andermann, F., Jacob, J. C. and Andermann, E. 1974. Lafora's disease: peroxisomal storage in skeletal muscle. Neurol. 24, 531-538.

12. Serratosa, J., Delgado-Escueta, A. V., Posada, I., Shih, S., Drury, I., Berciano, J., Zabala, J. A., Antunez, M. C. and Sparkes, R. S. 1995. The gene for progressive myoclonus epilepsy of the Lafora type maps to chromosome 6q. Hum. Molec. Genet. 9, 1657-1663.

13. Sainz J., Minassian B. A., Serratosa J. M., Gee M. N., Sakamoto L. M., Iranmanesh R., Bohlega S., Baumann R. J., Ryan S., Sparkes R. S., Delgado-Escueta A. V. 1997. Lafora progressive myoclonus epilepsy: narrowing the chromosome 6q24 locus by recombinations and homozygosities. Am. J. Hum. Genet. 61(5):1205-1209.

14. Denu J. M., Stuckey J. A., Saper M. A., Dixon J. E. 1996. Form and Function in Protein dephosphorylation. Cell 87:361-364.

15. Yuvaniyama J., Denu J. M., Dixon J. E., Saper M. A. 1996. Crystal structure of the dual specificity protein phosphatase VHR. Science 272:1328-1331.

16. Tonks N. K., Neel B. G. 1996. From form to function: signaling by protein tyrosine phosphatases. Cell 87:365-368.

17. Yokota T., Ishihara T., Yoshida H., Takahashi M., Uchino F., Hamanaka S. 1988. Monoclonal antibody against polyglucosan isolated from the myocardium of a patient with Lafora disease. J. Neuropath. & Exp. Neurol. 47(5):572-7.

18. Thon, V. J., Khalil, M. and Cannon, J. F. 1993. Isolation of human glycogen branching enzyme cDNAs by screening complementation in yeast. J. Biol. Chem. 268, 7509-7513.

19. Gurd J. W., Bissoon N. 1997. The N-methyl-D-aspartate receptor subunits NR2A and NR2B bind to the SH2 domains of phospholipase C-gamma. J. of Neurochem. Aug;69(2):623-30.

20. Llinas R., Moreno H., Sugimori M., Mohammadi M., Schlessinger J. 1997. Differential pre and postsynaptic modulation of chemical transmission in the squid giant synapse by tyrosine phosphorylation. Proc. Nat. Acad. Sciences (USA) 94(5):1990-1994.

21. Laporte J., Hu L. J., Kretz C., Mandel J. L., Kioschis P., Coy J. F., Klauck S. M., Poustka A., Dahl N. 1996. A gene mutated in X-linked myotubular myopathy defines a new putative tyrosine phosphatase family conserved in yeast. Nat. Genet. 13(2):175-82.

22. Cui X., DeVivo I., Slany R., Miyamoto A., Firestein R., Cleary M. L. 1998. Association of SET domain and myotubularin-related proteins modulates growth control. Nat. Genet. 18:331-337.

23. M. Kozak, 1996, Mamm. Genome 7:563.

24. S. McLaughlin and J. E. Dixon, 1993. Biol. Chem., 268:6839.

25. I. Lopes Cendes et al., 1995, Epilepsia, 36:S6.

26. J. N. Acharya, P. Satishchandra, S. K. Shankar. 1995, Epilepsia, 36:429.

27. Lafreniere R G, Rochefort D L, Chretien N et al. Unstable insertion in the 5′ flanking region of the cystatin B gene is the most common mutation in progressive myoclonus epilepsy type 1, EPM1. Nat Genet 1997;15:298-302.

28. Goebel H H. 7th International Congress on Neuronal Ceroid-Lipofuscinoses (NCL-98), 13-16 Jun. 1998, Dallas, USA. Brain Pathol 1998;8:809-810.

29. Drury I, Blaivas M, Abou-Khalil B W, Beydoun A. Biopsy results in a kindred with Lafora disease. Arch Neurol 1993;50:102-105.

30. Minassian B A, Sainz J, Serratosa J M et al. Genetic locus heterogeneity in Lafora's progressive myoclonus epilepsy. Ann Neurol 1999;45:262-265.

31. Smit A F and Riggs A D. MIRs are classic tRNA-derived SINEs that amplified before the mammalian radiation. Nucleic Acids Res 1995;23:98-102.

32. Reiter L T, Murakami T, Koeuth T et al. A recombination hotspot responsible for two inherited peripheral neuropathies is located near a mariner transposon-like element. Nature Genet 1996;12:288-297.

TABLE 1 Summary of mutations.

Family | Genetics¹ | Mutation (primers used)² | Predicted effect
LD-L4 | consanguineous | homozygous deletion (D6S1703 and 109F4.E0.5) | deletion of the majority of EPM2A
LD100-4 | consanguineous | homozygous insertion of A resulting in a frameshift (824F and 824R) | interruption of the tyrosine phosphatase domain
I-22 | consanguineous | homozygous mutation G → T (JRGXBCF and JRGXBCFR) | glutamic acid → stop
LD-33 | consanguineous | homozygous mutation A → T (824F and 824R) | glutamine → leucine
LD-5 | consanguineous | homozygous mutation C → T (JRGXBCF and JRGXBCFR) | arginine → cysteine
L6 | consanguineous (compound heterozygote) | 1. C → T (824R and H1F); 2. G → A (824F and 824R) | 1. arginine → stop; 2. glycine → serine
LD-16 | consanguineous | homozygous mutation C → T (824R and H1F) | arginine → stop
LD15 | consanguineous | homozygous mutation C → T (824R and H1F) | arginine → stop
LD-48 | consanguineous | homozygous mutation C → T (824R and H1F) | arginine → stop
LD13 | consanguineous | homozygous mutation C → T (824R and H1F) | arginine → stop
LM | Italian | heterozygous mutation G to A | **arginine to stop
LB | Non-consanguineous, Bolivian ethnicity | **one mutation (one chromosome) | **arginine to stop codon

¹Families L6, LD-16, LD15, LD-48 and LD13 are of common ethnic background.
²The locations of the PCR primers and mutations are shown in FIGS. 3 and 4, respectively.

TABLE 2 Most common EPM2A mutations to date Mutation n* Ethnic Origin 1C->T nonsense 5 Spanish mutation of bp 2 of exon 4 2 1 Spanish, 1Italian 2 G->A missense mutation 1 Spanish of bp115 of exon 4 3 (a) ˜75kb deletion 2 Arabic (b) ˜25 kb deletion 1 Iranian Total = 17 *n is thenumber of families with corresponding mutation

TABLE 3

Mutation | Nucleotide Position (FIG. 13) | Amino Acid Change (FIG. 14)
C → T | 721 | Arg (241) → stop
insert A | 800 | Premature stop
G → A | 836 | Gly (279) → Ser
C → T | 163 | Glu → Stop
T → G | 94 | Trp (32) → Gly
A → G | 146 | Asp (49) → Gly
G → T | 412 | Glu (138) → stop
A → T | 878 | Gln (293) → Leu
Delete G | 235 | Premature stop
G → A | 179 | Trp (60) → stop
C → T | 322 | Arg (108) → Cys
C → A | −12 |
Deletion (75 kb) | exons 1 and 2 |
Deletion (25 kb) | exon 2 |
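The nucleotide positions and residue numbers in Table 3 are related by simple codon arithmetic when position 1 is the A of the ATG start codon. The sketch below illustrates this under stated assumptions: the fragment is nt 301-360 transcribed from the sequence listing (SEQ ID NO: 1), and the function names codon_number and codon_at are hypothetical helpers.

```python
# nt 301-360 of SEQ ID NO: 1, transcribed for this illustration.
FRAGMENT_START = 301
FRAGMENT = (
    "ggcaatggac" "ctcatcatga" "ccgttgctgt"
    "acttacaatg" "aaaacaactt" "ggtggatggt"
)

# Nucleotide position -> residue number, as listed in Table 3.
TABLE3 = {721: 241, 836: 279, 94: 32, 146: 49,
          412: 138, 878: 293, 179: 60, 322: 108}


def codon_number(nt_position: int) -> int:
    """Codon (residue) number containing a 1-based coding position."""
    if nt_position < 1:
        raise ValueError("position must lie within the coding sequence")
    return (nt_position - 1) // 3 + 1


def codon_at(seq: str, seq_start: int, nt_position: int) -> str:
    """Return the codon containing nt_position, read from a fragment whose
    first base has coordinate seq_start."""
    first_nt = 3 * (codon_number(nt_position) - 1) + 1
    offset = first_nt - seq_start
    if offset < 0 or offset + 3 > len(seq):
        raise ValueError("codon lies outside the supplied fragment")
    return seq[offset:offset + 3].upper()


if __name__ == "__main__":
    for nt, residue in TABLE3.items():
        print(nt, codon_number(nt), residue)
    wild_type = codon_at(FRAGMENT, FRAGMENT_START, 322)
    print(wild_type, "->", "T" + wild_type[1:])   # C to T at nt 322
```

For nt 322 the codon read from the fragment is CGT (arginine in the standard genetic code), and a C to T change at its first base gives TGT (cysteine), in agreement with the Arg (108) → Cys entry.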

32 1 3128 DNA Homo sapiens 1 atgcgcttcc gctttggggt ggtggtgcca cccgccgtggccggcgcccg gccggagctg 60 ctggtggtgg ggtcgcggcc cgagctgggg cgttgggagccgcgcggtgc cgtccgcctg 120 aggccggccg gcaccgcggc gggcgacggg gccctggcgctgcaggagcc gggcctgtgg 180 ctcggggagg tggagctggc ggccgaggag gcggcgcaggacggggcgga gccgggccgc 240 gtggacacgt tctggtacaa gttcctgaag cgggagccgggaggagagct ctcctgggaa 300 ggcaatggac ctcatcatga ccgttgctgt acttacaatgaaaacaactt ggtggatggt 360 gtgtattgtc tcccaatagg acactggatt gaggccactgggcacaccaa tgaaatgaag 420 cacacaacag acttctattt taatattgca ggccaccaagccatgcatta ttcaagaatt 480 ctaccaaata tctggctggg tagctgccct cgtcaggtggaacatgttac catcaaactg 540 aagcatgaat tggggattac agctgtaatg aatttccagactgaatggga tattgtacag 600 aattcctcag gctgtaaccg ctacccagag cccatgactccagacactat gattaaacta 660 tatagggaag aaggcttggc ctacatctgg atgccaacaccagatatgag caccgaaggc 720 cgagtacaga tgctgcccca ggcggtgtgc ctgctgcatgcgctgctgga gaagggacac 780 atcgtgtacg tgcactgcaa cgctggggtg ggccgctccaccgcggctgt ctgcggctgg 840 ctccagtatg tgatgggctg gaatctgagg aaggtgcagtatttcctcat ggccaagagg 900 ccggctgtct acattgacga agaggccttg gcccgggcacaagaagattt tttccagaaa 960 tttgggaagg ttcgttcttc tgtgtgtagc ctgtagctggtcagcctgct tctgccccct 1020 cctgatttcc ctaaggagcc tgggatgatg ttggtcaaatgacctagaaa caaggattct 1080 acctgaactg aaaggactgt gtgacctccc caagccaaccactttcacct gggatgactt 1140 tcgattatgc tttggtttgg ggctgtattt ttgaaatactctacaagaaa gctgtggctc 1200 aacacatgag aagaagcacg aagcagttag gctgtacatcagacagaagg gtaatgcgtg 1260 cagttcctgc tgcctgcagg cagacgaggc ctttgctttacagcactgta tgtgttgcac 1320 gatggatccg tgacagcact ttcctgttgc actgaaactcttggccatgt agaggaaaag 1380 atatggagtt atgtggattt catcactagt atgtgtgccgtgagctggtc agttgccaaa 1440 ggaggaaata aggttagaag cctgaaccgt tacaaaagaagagctcacta tggtcaaaaa 1500 gtgatggctt tcaggacttg ttttttatcc tgcctcacagttgttaaagt ctgttccaag 1560 gcatcacctt ccttctctac ccaacaaccc tgtgtaacaactaaagtaga attatctctc 1620 atttgttggt ggtttttcct caaaattacc aaacaaagcaaaaaataccc ttgtttttta 1680 tagttgagat gtcaaggaag 
ttaaattgag gcttaatgagcataggtagc ttgtccaagg 1740 tctcatgacc agtcaagggc aagctggagt taataatctatatttatttg actcagcact 1800 gttttcatca caacttgttt tcccagcatc atgtagtgcatttagttttg tctttctcag 1860 ggtatagtca atatgcctgc aggagtttct atagcgagacatagaatagt attctgatca 1920 gttgccaaag aatctaggaa attagttgta ttttgtgcaagctaatttaa aaacatgatg 1980 ggctgtttta agaccagagt ggaaattcat gagaggaactatactaccaa aagagcccaa 2040 atgaccaaat ccatggataa ttgcttcaca gccttggccatcctggctca gctctcaatt 2100 tagtataata tgcagttcct gtgcctccag actatgcagctcatcaccct aggttctaca 2160 ggaaatacag agatgaacaa ctttgccttc aaaaaatgtgctgcctagaa aacagacctg 2220 catttcaacc caactgtaat gcaggatttg gaccatgaatgatatgctag aatagaagaa 2280 agagaagtgt ttttttaatt gagagcctct atgtgcaaggtgatatataa tcatatccag 2340 tttaatcttc acaatatcca atgaagaagg tctcattatctccatgataa agatggggaa 2400 actaaggtca gaagggttaa ctcaactgtc tattgtcacatgatgaataa atagatgaag 2460 tgagatacaa agctgggttt gattcaaagc ccttactttcctaattaaac tatgatgcgt 2520 atttattttt ctgcaccttc ctttcttcca caaacacatattgatagatg caagagactc 2580 ttatttataa ggcgtggggg acaagaagga tacaaggtaagtttcagtgg agctcagagg 2640 acggggagat agaactgtgg cacttagggg agatgacatttgctttgggc agaggcagct 2700 agccaggaca catttccact ataattttac aaagttaaatttataagcta gcattaagta 2760 aagtgaagtc cagctccctt gctaaaaata actagaggtaataattggta ttcaggtaac 2820 tcatttacag tcataatgtg ttgtgaaaat ttaatcttaaaaattaaatt tttaaactat 2880 gtgggtctgt gaatttcttt aatgtctaag aaatccagcttcataatttc catgatacaa 2940 agatcttttt tcaggtggat ttttaccttt gttccttttgctctgataga caaaatcagt 3000 ttaggactat taaagaatgt tttggaataa actgtctttttcctcaatga atgggatgtc 3060 taatgtattt caaaatcacc caaaactttt ggcaaataaaagcatttaaa aagaaaaaaa 3120 aaaaaaaa 3128 2 331 PRT Homo sapiens 2 MetArg Phe Arg Phe Gly Val Val Val Pro Pro Ala Val Ala Gly Ala 1 5 10 15Arg Pro Glu Leu Leu Val Val Gly Ser Arg Pro Glu Leu Gly Arg Trp 20 25 30Glu Pro Arg Gly Ala Val Arg Leu Arg Pro Ala Gly Thr Ala Ala Gly 35 40 45Asp Gly Ala Leu Ala Leu Gln Glu Pro Gly Leu Trp Leu Gly Glu Val 50 55 60Glu Leu Ala 
Ala Glu Glu Ala Ala Gln Asp Gly Ala Glu Pro Gly Arg 65 70 7580 Val Asp Thr Phe Trp Tyr Lys Phe Leu Lys Arg Glu Pro Gly Gly Glu 85 9095 Leu Ser Trp Glu Gly Asn Gly Pro His His Asp Arg Cys Cys Thr Tyr 100105 110 Asn Glu Asn Asn Leu Val Asp Gly Val Tyr Cys Leu Pro Ile Gly His115 120 125 Trp Ile Glu Ala Thr Gly His Thr Asn Glu Met Lys His Thr ThrAsp 130 135 140 Phe Tyr Phe Asn Ile Ala Gly His Gln Ala Met His Tyr SerArg Ile 145 150 155 160 Leu Pro Asn Ile Trp Leu Gly Ser Cys Pro Arg GlnVal Glu His Val 165 170 175 Thr Ile Lys Leu Lys His Glu Leu Gly Ile ThrAla Val Met Asn Phe 180 185 190 Gln Thr Glu Trp Asp Ile Val Gln Asn SerSer Gly Cys Asn Arg Tyr 195 200 205 Pro Glu Pro Met Thr Pro Asp Thr MetIle Lys Leu Tyr Arg Glu Glu 210 215 220 Gly Leu Ala Tyr Ile Trp Met ProThr Pro Asp Met Ser Thr Glu Gly 225 230 235 240 Arg Val Gln Met Leu ProGln Ala Val Cys Leu Leu His Ala Leu Leu 245 250 255 Glu Lys Gly His IleVal Tyr Val His Cys Asn Ala Gly Val Gly Arg 260 265 270 Ser Thr Ala AlaVal Cys Gly Trp Leu Gln Tyr Val Met Gly Trp Asn 275 280 285 Leu Arg LysVal Gln Tyr Phe Leu Met Ala Lys Arg Pro Ala Val Tyr 290 295 300 Ile AspGlu Glu Ala Leu Ala Arg Ala Gln Glu Asp Phe Phe Gln Lys 305 310 315 320Phe Gly Lys Val Arg Ser Ser Val Cys Ser Leu 325 330 3 2940 DNA Homosapiens 3 ggtggagctg gcggccgagg aggcggcgca ggacggggcg gagccgggccgcgtggacac 60 gttctggtac aagttcctga agcgggagcc gggaggagag ctctcctgggaaggcaatgg 120 acctcatcat gaccgttgct gtacttacaa tgaaaacaac ttggtggatggtgtgtattg 180 tctcccaata ggacactgga ttgaggccac tggacacacc aatgaaatgaagcacacaac 240 agacttctat tttaatattg caggccacca agccatgcat tattcaagaattctaccaaa 300 tatctggctg ggtagctgcc ctcgacaggt ggaacatgtt accatcaaactgaagcatga 360 attggggatt acagctgtca tgaatttcca gactgaatgg gatattgttcagaattcctc 420 atgctgtaac cgctacccag agcccatgac tccagacact atgattaaactatctaggga 480 agaaggcttg gcctacatct ggatgccaac accagatatg agcaccgcaggccgagtaca 540 gatgctgccc caggcggtgt gcctgctgca tgcgctgctg gagaagggacacatcgtgta 600 cgtgcactgc aacgctgggg tgggccgctc 
caccgcggct gtctgcggctggctccagta 660 tgtgatgggc tggaatctga ggaaggtgca gtatttcctc atggccaagaggccggctgt 720 ctacattgac gaagaggcct tggcccgggc acaagaagat tttttccagaaatttgggaa 780 ggttcgttct tctgtgtgta gcctgtagct ggtcagcctg cttctgccccctcctgattt 840 ccctaaggag cctgggatga tgttggtcaa atgacctaga aacaaggattctacctgaac 900 tgaaaggact gtgtgacctc cccaagccaa ccactttcac ctgggatgactttcgattat 960 gctttggttt ggggctgtat ttttgaaata ctctacaaga aagctgtggctcaacacatg 1020 agaagaagca cgaagcagtt aggctgtaca tcagacagaa gggtaatgcgtgcagttcct 1080 gctgcctgca ggcagacgag gcctttgctt tacagcactg tatgtgttgcacgatggatc 1140 cgtgacagca ctttcctgtt gcactgaaac tcttggccat gtagaggaaaagatatggag 1200 ttatgtggat ttcatcacta gtatgtgtgc cgtgagctgg tcagttgccaaaggaggaaa 1260 taaggttaga agcctgaacc gttacaaaag aagagctcac tatggtcaaaaagtgatggc 1320 tttcaggact tgttttttat cctgcctcac agttgttaaa gtctgttccaaggcatcacc 1380 ttccttctct acccaacaac cctgtgtaac aactaaagta gaattatctctcatttgttg 1440 gtggtttttc ctcaaaatta ccaaacaaag caaaaaatac ccttgttttttatagttgag 1500 atgtcaagga agttaaattg aggcttaatg agcataggta gcttgtccaaggtctcatga 1560 ccagtcaagg gcaagctgga gttaataatc tatatttatt tgactcagcactgttttcat 1620 cacaacttgt tttcccagca tcatgtagtg catttagttt tgtctttctcagggtatagt 1680 caatatgcct gcaggagttt ctatagcgag acatagaata gtattctgatcagttgccaa 1740 agaatctagg aaattagttg tattttgtgc aagctaattt aaaaacatgatgggctgttt 1800 taagaccaga gtggaaattc atgagaggaa ctatactacc aaaagagcccaaatgaccaa 1860 atccatggat aattgcttca cagccttggc catcctggct cagctctcaatttagtataa 1920 tatgcagttc ctgtgcctcc agactatgca gctcatcacc ctaggttctacaggaaatac 1980 agagatgaac aactttgcct tcaaaaaatg tgctgcctag aaaacagacctgcatttcaa 2040 cccaactgta atgcaggatt tggaccatga atgatatgct agaatagaagaaagagaagt 2100 gtttttttaa ttgagagcct ctatgtgcaa ggtgatatat aatcatatccagtttaatct 2160 tcacaatatc caatgaagaa ggtctcatta tctccatgat aaagatggggaaactaaggt 2220 cagaagggtt aactcaactg tctattgtca catgatgaat aaatagatgaagtgagatac 2280 aaagctgggt ttgattcaaa gcccttactt tcctaattaa actatgatgcgtatttattt 2340 
ttctgcacct tcctttcttc cacaaacaca tattgataga tgcaagagactcttatttat 2400 aaggcgtggg ggacaagaag gatacaaggt aagtttcagt ggagctcagaggacggggag 2460 atagaactgt ggcacttagg ggagatgaca tttgctttgg gcagaggcagctagccagga 2520 cacatttcca ctataatttt acaaagttaa atttataagc tagcattaagtaaagtgaag 2580 tccagctccc ttgctaaaaa taactagagg taataattgg tattcaggtaactcatttac 2640 agtcataatg tgttgtgaaa atttaatctt aaaaattaaa tttttaaactatgtgggtct 2700 gtgaatttct ttaatgtcta agaaatccag cttcataatt tccatgatacaaagatcttt 2760 tttcaggtgg atttttacct ttgttccttt tgctctgata gacaaaatcagtttaggact 2820 attaaagaat gttttggaat aaactgtctt tttcctcaat gaatgggatgtctaatgtat 2880 ttcaaaatca cccaaaactt ttggcaaata aaagcattta aaaagaaaaaaaaaaaaaaa 2940 4 268 PRT Homo sapiens 4 Val Glu Leu Ala Ala Glu Glu AlaAla Gln Asp Gly Ala Glu Pro Gly 1 5 10 15 Arg Val Asp Thr Phe Trp TyrLys Phe Leu Lys Arg Glu Pro Gly Gly 20 25 30 Glu Leu Ser Trp Glu Gly AsnGly Pro His His Asp Arg Cys Cys Thr 35 40 45 Tyr Asn Glu Asn Asn Leu ValAsp Gly Val Tyr Cys Leu Pro Ile Gly 50 55 60 His Trp Ile Glu Ala Thr GlyHis Thr Asn Glu Met Lys His Thr Thr 65 70 75 80 Asp Phe Tyr Phe Asn IleAla Gly His Gln Ala Met His Tyr Ser Arg 85 90 95 Ile Leu Pro Asn Ile TrpLeu Gly Ser Cys Pro Arg Gln Val Glu His 100 105 110 Val Thr Ile Lys LeuLys His Glu Leu Gly Ile Thr Ala Val Met Asn 115 120 125 Phe Gln Thr GluTrp Asp Ile Val Gln Asn Ser Ser Cys Cys Asn Arg 130 135 140 Tyr Pro GluPro Met Thr Pro Asp Thr Met Ile Lys Leu Ser Arg Glu 145 150 155 160 GluGly Leu Ala Tyr Ile Trp Met Pro Thr Pro Asp Met Ser Thr Ala 165 170 175Gly Arg Val Gln Met Leu Pro Gln Ala Val Cys Leu Leu His Ala Leu 180 185190 Leu Glu Lys Gly His Ile Val Tyr Val His Cys Asn Ala Gly Val Gly 195200 205 Arg Ser Thr Ala Ala Val Cys Gly Trp Leu Gln Tyr Val Met Gly Trp210 215 220 Asn Leu Arg Lys Val Gln Tyr Phe Leu Met Ala Lys Arg Pro AlaVal 225 230 235 240 Tyr Ile Asp Glu Glu Ala Leu Ala Arg Ala Gln Glu AspPhe Phe Gln 245 250 255 Lys Phe Gly Lys Val Arg Ser Ser Val Cys Ser Leu260 265 5 915 DNA Homo sapiens 5 
ccaagaatcg gcacgaggat tattcaagaattctaccaaa tatctggctg ggtagctgcc 60 ctcgacaggt ggaacatgtt accatcaaactgaagcatga attggggatt acagctgtca 120 tgaatttcca gactgaatgg gatattgttcagaattcctc atgctgtaac cgctacccag 180 agcccatgac tccagacact atgattaaactatctaggga agaaggcttg gcctacatct 240 ggatgccaac accagatatg agcaccgcaggccgagtaca gatgctgccc caggcggtgt 300 gcctgctgca tgcgctgctg gagaagggacacatcgtgta cgtgcactgc aacgctgggg 360 tgggccgctc caccgcggct gtctgcggctggctccagta tgtgatgggc tggaatctga 420 ggaaggtgca gtatttcctc atggccaagaggccggctgt ctacattgac gaagaggcag 480 ctagccagga cacatttcca ctataattttacaaagttaa atttataagc tagcattaag 540 taaagtgaag tccagctccc ttgctaaaaataactagagg taataattgg tattcaggta 600 actcatttac agtcataatg tgttgtgaaaatttaatctt aaaaattaaa tttttaaact 660 atgtgggtct gtgaatttct ttaatgtctaagaaatccag cttcataatt tccatgatac 720 aaagatcttt tttcaggtgg atttttacctttgttccttt tgctctgata gacaaaatca 780 gtttaggact attaaagaat gttttggaataaactgtctt tttcctcaat gaatgggatg 840 tctaatgtat ttcaaaatca cccaaaacttttggcaaata aaagcattta aaaagaaaaa 900 aaaaaaaaaa aaaaa 915 6 167 PRT Homosapiens 6 Lys Asn Arg His Glu Asp Tyr Ser Arg Ile Leu Pro Asn Ile TrpLeu 1 5 10 15 Gly Ser Cys Pro Arg Gln Val Glu His Val Thr Ile Lys LeuLys His 20 25 30 Glu Leu Gly Ile Thr Ala Val Met Asn Phe Gln Thr Glu TrpAsp Ile 35 40 45 Val Gln Asn Ser Ser Cys Cys Asn Arg Tyr Pro Glu Pro MetThr Pro 50 55 60 Asp Thr Met Ile Lys Leu Ser Arg Glu Glu Gly Leu Ala TyrIle Trp 65 70 75 80 Met Pro Thr Pro Asp Met Ser Thr Ala Gly Arg Val GlnMet Leu Pro 85 90 95 Gln Ala Val Cys Leu Leu His Ala Leu Leu Glu Lys GlyHis Ile Val 100 105 110 Tyr Val His Cys Asn Ala Gly Val Gly Arg Ser ThrAla Ala Val Cys 115 120 125 Gly Trp Leu Gln Tyr Val Met Gly Trp Asn LeuArg Lys Val Gln Tyr 130 135 140 Phe Leu Met Ala Lys Arg Pro Ala Val TyrIle Asp Glu Glu Ala Ala 145 150 155 160 Ser Gln Asp Thr Phe Pro Leu 1657 20 DNA Homo sapiens 7 gaatgctctt tccactttgc 20 8 19 DNA Homo sapiens 8ggctccttag ggaaatcag 19 9 22 DNA Homo sapiens 9 tccattgtgc taatgctatc tc22 10 
22 DNA Homo sapiens 10 tcagcttgct ttgaggatat tt 22
11 20 DNA Homo sapiens 11 cggcacgagg attattcaag 20
12 19 DNA Homo sapiens 12 gctcgggtac tgaggtctg 19
13 22 DNA Homo sapiens 13 agttgttaca cagggttgtt gg 22
14 22 DNA Homo sapiens 14 aggctgtaca tcagacagaa gg 22
15 22 DNA Homo sapiens 15 tccattgtgc taatgctatc tc 22
16 22 DNA Homo sapiens 16 tcagcttgct ttgaggatat tt 22
17 19 DNA Homo sapiens 17 gccgagtaca gatgctgcc 19
18 22 DNA Homo sapiens 18 cacacagtcc tttcagttca gg 22
19 33 DNA Homo sapiens 19 gcccgggtat tcgcgccgcc gccgcccgcc atg 33
20 19 DNA Homo sapiens 20 atcatgaccg ttgctgtac 19
21 21 DNA Homo sapiens 21 tcatcatgac tgttgctgta c 21
22 10 PRT Artificial Sequence Description of Artificial Sequence Consensus sequence 22 His Cys Xaa Xaa Gly Xaa Xaa Arg Ser Thr 1 5 10
23 10 RNA Artificial Sequence Description of Artificial Sequence Consensus sequence 23 cccgccaugc 10
24 10 RNA Artificial Sequence Description of Artificial Sequence Consensus sequence 24 gccrccaugg 10
25 22 PRT Homo sapiens 25 Leu Ala Arg Ala Gln Glu Asp Phe Phe Gln Lys Phe Gly Lys Val Arg 1 5 10 15 Ser Ser Val Cys Ser Leu 20
26 8 PRT Homo sapiens 26 Ala Ser Gln Asp Thr Phe Pro Leu 1 5
27 11 PRT Homo sapiens 27 Val His Cys Asn Ala Gly Val Gly Arg Ser Thr 1 5 10
28 11 PRT Homo sapiens 28 Val His Cys Ser Asp Gly Trp Asp Arg Thr Ala 1 5 10
29 11 PRT Homo sapiens 29 Ile His Cys Lys Ala Gly Lys Gly Arg Thr Gly 1 5 10
30 11 PRT Homo sapiens 30 Val His Cys Ser Ala Gly Ile Gly Arg Ser Gly 1 5 10
31 11 PRT Homo sapiens 31 Val His Cys Ser Ala Gly Ile Gly Arg Ser Gly 1 5 10
32 11 PRT Homo sapiens 32 Val His Cys Gln Ala Gly Ile Ser Arg Ser Ala 1 5 10

We claim:
 1. A nucleic acid containing a sequence encoding a protein tyrosine phosphatase which is associated with Lafora's disease having a sequence as shown in SEQ.ID.NO.: 1.
 2. An isolated nucleic acid molecule containing a sequence encoding a protein tyrosine phosphatase which is associated with Lafora's disease comprising (a) a nucleic acid sequence as shown in SEQ.ID.NO.: 1, wherein T can also be U; (b) nucleic acid sequences complementary to (a); or (c) a nucleic acid molecule differing from any of the nucleic acids of (a) or (b) in codon sequences due to the degeneracy of the genetic code.
 3. An isolated nucleic acid molecule which is associated with Lafora's disease consisting of a sequence as shown in SEQ ID NO: 3.
 4. An isolated nucleic acid molecule which is associated with Lafora's disease consisting of a sequence as shown in SEQ ID NO: 5.